In the last two chapters, we saw how refinements could be used to reason about the properties of basic
Int values like vector indices, or the elements of a list. Next, lets see how we can describe properties of aggregate structures like lists and trees, and use these properties to improve the APIs for operating over such structures.
As a motivating example, let us return to the problem of ensuring the safety of division. Recall that we wrote:
The Precondition asserted by the input type
NonZero allows LiquidHaskell to prove that the
die is never executed at run-time, but consequently, requires us to establish that wherever
divide is used, the second parameter be provably non-zero. This requirement is not onerous when we know what the divisor is statically
However, it can be more of a challenge when the divisor is obtained dynamically. For example, lets write a function to find the number of elements in a list
and use it to compute the average value of a list:
Uh oh. LiquidHaskell wags its finger at us!
src/04-measure.lhs:77:27-31: Error: Liquid Type Mismatch Inferred type VV : Int | VV == elems not a subtype of Required type VV : Int | 0 /= VV In Context VV : Int | VV == elems elems : Int
We cannot prove that the divisor is
NonZero, because it can be
0 – when the list is empty. Thus, we need a way of specifying that the input to
avgMany is indeed non-empty!
How shall we tell LiquidHaskell that a list is non-empty? Recall the notion of
measure previously introduced to describe the size of a
Data.Vector. In that spirit, lets write a function that computes whether a list is not empty:
A measure is a total Haskell function,
We can tell LiquidHaskell to lift a function meeting the above requirements into the refinement logic by declaring:
Non-Empty Lists can now be described as the subset of plain old Haskell lists
[a] for which the predicate
We can now refine various signatures to establish the safety of the list-average function.
Size returns a non-zero value if the input list is not-empty. We capture this condition with an implication in the output refinement.
Average is only sensible for non-empty lists. Happily, we can specify this using the refined
Exercise: (Average, Maybe): Fix the code below to obtain an alternate variant
average' that returns
Nothing for empty lists:
Exercise: (Debugging Specifications): An important aspect of formal verifiers like LiquidHaskell is that they help establish properties not just of your implementations but equally, or more importantly, of your specifications. In that spirit, can you explain why the following two variants of
size are rejected by LiquidHaskell?
Now that we can talk about non-empty lists, we can ensure the safety of various list-manipulating functions which are only well-defined on non-empty lists and crash otherwise.
Head and Tail are two of the canonical dangerous functions, that only work on non-empty lists, and burn horribly otherwise. We can type them simple as:
LiquidHaskell uses the precondition to deduce that the second equations are dead code. Of course, this requires us to establish that callers of
tail only invoke the respective functions with non-empty lists.
Exercise: (Safe Head): Write down a specification for
null such that
safeHead is verified. Do not force
null to only take non-empty inputs, that defeats the purpose. Instead, its type should say that it works on all lists and returns
True if and only if the input is non-empty.
Hint: You may want to refresh your memory about implies
<=> from the chapter on logic.
Lets use the above to write a function that chunks sequences into non-empty groups of equal elements:
By using the fact that each element in the output returned by
groupEq is in fact of the form
x:ys, LiquidHaskell infers that
groupEq returns a
[NEList a] that is, a list of non-empty lists.
To Eliminate Stuttering from a string, we can use
groupEq to split the string into blocks of repeating
Chars, and then just extract the first
Char from each block:
LiquidHaskell automatically instantiates the type parameter for
notEmpty v to deduce that
head is only called on non-empty lists.
Foldl1 is one of my favorite folds; it uses the first element of the sequence as the initial value. Of course, it should only be called with non-empty sequences!
To Sum a non-empty list of numbers, we can just perform a
foldl1 with the
+ operator: Thanks to the precondition, LiquidHaskell will prove that the
die code is indeed dead. Thus, we can write
Consequently, we can only invoke
sum on non-empty lists, so:
Exercise: (Weighted Average): The function below computes a weighted average of its input. Unfortunately, LiquidHaskell is not very happy about it. Can you figure out why, and fix the code or specification appropriately?
Hint: On what variables are the errors? How are those variables’ values computed? Can you think of a better specification for the function(s) doing those computations?
Exercise: (Mitchell's Risers): Non-empty lists pop up in many places, and it is rather convenient to have the type system track non-emptiness without having to make up special types. Consider the
risers function, popularized by Neil Mitchell.
safeSplit requires its input be non-empty; but LiquidHaskell believes that the call inside
risers fails this requirement. Fix the specification for
risers so that it is verified.
In this chapter we saw how LiquidHaskell lets you
Define structural properties of data types,
Use refinements over these properties to describe key invariants that establish, at compile-time, the safety of operations that might otherwise fail on unexpected values at run-time, all while,
Working with plain Haskell types, here, Lists, without having to make up new types which can have the unfortunate effect of adding a multitude of constructors and conversions which often clutter implementations and specifications.
Of course, we can do a lot more with measures, so lets press on!