The Programming Language

A story of compromises and types

09 Sep 2016 by bcardiff

Let’s play with an immutable Queue type. We want to:

So something like this:

q = build_queue  # q = {}
new_q = q.push 2 # q = {2}
e, old_q = new_q.pop # e = 2, q = {}

If we jump into creating a Queue class we will get the following skeleton:

class Queue
  # returns a new queue with `e` at the beginning
  def push(e)
    # ...
  end

  # returns the next element and a queue
  # with the remaining elements
  # fails if queue is empty
  def pop
    raise EmptyQueueRuntimeError.new if empty?
    # ...
  end
end

This could work. However, there are some kinds of programs that will compile, but will always fail to run:

q = Queue.new
e, q = q.pop # => EmptyQueueRuntimeError :-(

Going in a a similar direction of the NullPointerException, we could try to split the queue values that will help us move from this EmptyQueueRuntimeError to a compile error. For that, we need to differentiate the EmptyQueue from the non-empty Queues.

class EmptyQueue
  # Always return a Queue
  def push(e)
  end
end

class Queue
  # Always return a Queue
  def push(e)
    # ...
  end

  # Return Queue or EmptyQueue
  def pop
  end
end

q = EmptyQueue.new
q.pop # => Compile Error :-) EmptyQueue does not have EmptyQueue#pop

But is it really useful?

q0 = EmptyQueue.new   # q0 = {}
q1 = q0.push 1        # q1 = {1}
q2 = q1.push 2        # q2 = {2, 1}
e, q3 = q2.pop        # q3 = {1}, e = 2
e, q4 = q3.pop        # => Compile Error :-( , but *we* know q3 is not empty...

The compile error is because typeof(q3) :: EmptyQueue | Queue.

The fact that popping from a nonempty queue may lead to an empty queue stands between us and what we want to do. A “may lead” is translated to the return type since it is one of the possible results in runtime.

We could try to do something crazy. What if Queue contains the amount of elements in its type. We know that if you:

It seems reasonable (and a bit crazy).

In a static typed language the compiler will use the types of the expressions. It won’t, since it can’t, rely on their actual values because those will only appear during execution. Programmers can deduce a certain behaviour of a program that holds for all possible executions, but this analysis goes beyond just the types of the expressions. This is something hard to realise if you are only used to dynamic languages. What we are trying to do is to take a bit more of information from the queues and add it to the Queue(N) type, so that the compiler will be able to use the information of the types to decide if the program does actually makes sense and hence compiles.

For this we will need:

Even if we added this, it’s a risky business: Let’s imagine we want a #filter operation that will remove from the queue all elements that are equal to a certain value. What will be the return type of Queue(N)#filter(e)?

Potentially Queue(N) | Queue(N-1) | Queue(N-2) | .... | Queue(1) | EmptyQueue. Or even Queue(N-R) where R=#count of e in self. See? scary 👻.

Types are powerful stuff.

There is a lot of theory behind type systems. This story grazes Dependent types system.

Types group values and serve as a basis for static analysis. Analysis that won’t depend on the actual values.

If the type goes into too much detail, then it will be a pain to use. It the type ignores lots of details, then we will get lots of exceptions in runtime.

This situation has been coming up all the time since we began designing Crystal’s type system and API. We aim to define a useful type system that will help programmers catch runtime exceptions, but we want it to be usable and easy-going. That is why:

Although the Queue story didn’t end up so well, it did end up well for all of the above types, allowing us to have a happy time while crystalling. Success!

comments powered by Disqus