A story of compromises and types
Let’s play with an immutable Queue type. We want to:
- create an empty queue
- push things into a queue and get a new queue with the added element
- pop the next element of the queue and also get the rest of the queue.
So something like this:
If we jump into creating a Queue class we will get the following skeleton:
This could work. However, there are some kinds of programs that will compile, but will always fail to run:
Going in a a similar direction of the NullPointerException, we could try to split the queue values that will help us move from this
EmptyQueueRuntimeError to a compile error. For that, we need to differentiate the
EmptyQueue from the non-empty Queues.
But is it really useful?
The compile error is because
typeof(q3) :: EmptyQueue | Queue.
The fact that popping from a nonempty queue may lead to an empty queue stands between us and what we want to do. A “may lead” is translated to the return type since it is one of the possible results in runtime.
We could try to do something crazy. What if
Queue contains the amount of elements in its type. We know that if you:
- push an element to a
Queue(N), you will get a
- pop from a
Queue(1), you will get an
- pop from a
N > 1, you wil get a
It seems reasonable (and a bit crazy).
In a static typed language the compiler will use the types of the expressions. It won’t, since it can’t, rely on their actual values because those will only appear during execution. Programmers can deduce a certain behaviour of a program that holds for all possible executions, but this analysis goes beyond just the types of the expressions. This is something hard to realise if you are only used to dynamic languages. What we are trying to do is to take a bit more of information from the queues and add it to the
Queue(N) type, so that the compiler will be able to use the information of the types to decide if the program does actually makes sense and hence compiles.
For this we will need:
- A way to declare overloads
N > 1)
- Be able to use Math operators in types: given
- And teach the type inference how to deduce these things so we don’t need to always write them.
Even if we added this, it’s a risky business: Let’s imagine we want a
#filter operation that will remove from the queue all elements that are equal to a certain value. What will be the return type of
Queue(N) | Queue(N-1) | Queue(N-2) | .... | Queue(1) | EmptyQueue.
R=#count of e in self.
See? scary 👻.
Types are powerful stuff.
There is a lot of theory behind type systems. This story grazes Dependent types system.
Types group values and serve as a basis for static analysis. Analysis that won’t depend on the actual values.
If the type goes into too much detail, then it will be a pain to use. It the type ignores lots of details, then we will get lots of exceptions in runtime.
This situation has been coming up all the time since we began designing Crystal’s type system and API. We aim to define a useful type system that will help programmers catch runtime exceptions, but we want it to be usable and easy-going. That is why:
Nilis not a value of every type
T. You need to use unions of
T | Nil(or
Array(T)is able to hold only one type (although it can be a union), but we don’t know which indices are valid in compile time.
Tuples(*T)are not as flexible as an
Array(T)but given a literal we can know if it is a valid index and which type it corresponds to in compile time.
Tuple(*T)relationship is analogous to
Queue story didn’t end up so well, it did end up well for all of the above types,
allowing us to have a happy time while crystalling. Success!