loop
Often, correctness requires us to reason about the set of elements represented inside a data structure, or manipulated by a function. Examples of this abound: for example, we’d like to know that:
sorting routines return permutations of their inputs – i.e. return collections whose elements are the same as the input set,
resource management functions do not inadvertently create duplicate elements or drop elements from set of tracked resources.
syntax-tree manipulating procedures create well-scoped trees where the set of used variables are contained within the set of variables previously defined.
SMT Solvers support very expressive logics. In addition to linear arithmetic and uninterpreted functions, they can efficiently decide formulas over sets. Next, lets see how LiquidHaskell lets us exploit this fact to develop types and interfaces that guarantee invariants over the set of elements of a structures.
First, we need a way to talk about sets in the refinement logic. We could roll our own special Haskell type but for now, lets just use the Set a
type from the prelude’s Data.Set
.1
LiquidHaskell Lifts the basic set operators from Data.Set
into the refinement logic. That is, the prelude defines the following logical functions that correspond to the Haskell functions of the same name:
measure empty :: Set a
measure singleton :: a -> Set a
measure member :: a -> Set a -> Bool
measure union :: Set a -> Set a -> Set a
measure intersection :: Set a -> Set a -> Set a
measure difference :: Set a -> Set a -> Set a
The above operators are interpreted by the SMT solver. That is, just like the SMT solver “knows”, via the axioms of the theory of arithmetic that: \[x = 1 + 1 \Rightarrow x = 2\] is a valid formula, i.e. holds for all \(x\), the solver “knows” that: \[x = \tsng{1} \Rightarrow y = \tsng{2} \Rightarrow x = \tcap{x}{\tcup{y}{x}}\] This is because, the above formulas belong to a decidable Theory of Sets reduces to McCarthy’s more general Theory of Arrays. 2
To get the hang of whats going on, lets do a few warm up exercises, using LiquidHaskell to prove various simple theorems about sets and operations over them.
We Refine The Set API to make it easy to write down theorems. That is, we give the operators in Data.Set
refinement type signatures that precisely track their set-theoretic behavior:
empty :: {v:Set a | v = empty}
member :: x:a
-> s:Set a
-> {v:Bool | v <=> member x s}
singleton :: x:a -> {v:Set a | v = singleton x}
union :: x:Set a
-> y:Set a
-> {v:Set a | v = union x y}
intersection :: x:Set a
-> y:Set a
-> {v:Set a | v = intersection x y}
difference :: x:Set a
-> y:Set a
-> {v:Set a | v = difference x y}
We Can Assert Theorems as QuickCheck style properties, that is, as functions from arbitrary inputs to a Bool
output that must always be True
. Lets define aliases for the Bool
eans that are always True
or False
We can use True
to state theorems. For example, the unexciting arithmetic equality above becomes:
Where implies
is just the implication function over Bool
and Implies p q
is defined as
Exercise: (Bounded Addition): Write and prove a QuickCheck style theorem that: \(\forall x, y. x < 100 \wedge y < 100 \Rightarrow x + y < 200\).
The Commutativity of Intersection can be easily stated and proved as a QuickCheck style theorem:
The Associativity of Union can similarly be confirmed:
The Distributivity Laws for Boolean Algebra can be verified by writing properties over the relevant operators. For example, we lets check that intersection
distributes over union
:
Non-Theorems should be rejected. So, while we’re at it, let’s make sure LiquidHaskell doesn’t prove anything that isn’t true …
Exercise: (Set Difference): Why does the above property fail?
Use QuickCheck (or your own little grey cells) to find a counterexample for the property prop_cup_dif_bad
.
Use the counterexample to assign pre
a non-trivial (i.e. other than False
) condition so that the property can be proved.
Thus, LiquidHaskell’s refined types offer a nice interface for interacting with the SMT solvers in order to prove theorems, while letting us use QuickCheck to generate counterexamples.3
Lets return to our real goal, which is to verify properties of programs. First, we need a way to refine the list API to precisely track the set of elements in a list.
The Elements of a List can be described by a simple recursive measure that walks over the list, building up the set:
Lets write a few helpful aliases for various refined lists that will then make the subsequent specifications pithy and crisp.
S
X
X
X
and Y
X
and the contents of Y
The Measures strengthens the data constructors for lists. That is we get the automatically refined types for “nil” and “cons”:
data [a] where
[] :: ListEmp a
(:) :: x:a -> xs:[a] -> ListUn1 a x xs
Lets take our new vocabulary out for a spin!
The Append function returns a list whose elements are the union of the elements of the input Lists:
Exercise: (Reverse): Write down a type for revHelper
so that reverse'
is verified by LiquidHaskell.
Exercise: (Halve): Write down a specification for halve
such that the subsequent “theorem” prop_halve_append
is proved by LiquidHaskell.
Hint: You may want to remind yourself about the dimension-aware signature for partition
from the earlier chapter.
Exercise: (Membership): Write down a signature for elem
that suffices to verify test1
and test2
.
Next, lets use the refined list API to prove that various sorting routines return permutations of their inputs, that is, return output lists whose elements are the same as those of the input lists.4
Insertion Sort is the simplest of all the list sorting routines; we build up an (ordered) output list insert
ing each element of the input list into the appropriate position of the output:
Thus, the output of insert
has all the elements of the input xs
, plus the new element x
:
The above signature lets us prove that the output of the sorting routine indeed has the elements of the input:
Exercise: (Merge): Fix the specification of merge
so that the subsequent property prop_merge_app
is verified by LiquidHaskell.
Exercise: (Merge Sort): Once you write the correct type for merge
above, you should be able to prove the unexpected signature for mergeSort
below.
Make sure you are able verify the given signature.
Obviously we don’t want mergeSort
to return the empty list, so there’s a bug. Find and fix it, so that you cannot prove that the output is empty, but can instead prove that the output is ListEq a xs
.
Often, we want to enforce the invariant that a particular collection contains no duplicates; as multiple copies in a collection of file handles or system resources can create unpleasant leaks. For example, the xmonad window manager creates a sophisticated zipper data structure to hold the list of active user windows and carefully maintains the invariant that that the zipper contains no duplicates. Next, lets see how to specify and verify this invariant using LiquidHaskell, first for lists, and then for a simplified zipper.
To Specify Uniqueness we need a way of saying that a list has no duplicates. There are many ways to do so; the simplest is a measure:
We can use the above to write an alias for duplicate-free lists
Lets quickly check that the right lists are indeed unique
The Filter function returns a subset of its elements, and hence, preserves uniqueness. That is, if the input is unique, the output is too:
Exercise: (Filter): It seems a bit draconian to require that filter
only be called with unique lists. Write down a more permissive type for filter'
below such that the subsequent uses are verified by LiquidHaskell.
Exercise: (Reverse): When we reverse
their order, the set of elements is unchanged, and hence unique (if the input was unique). Why does LiquidHaskell reject the below? Can you fix things so that we can prove that the output is a UList a
? (When you are done, you should be able to remove the assume
from the signature below, and still have LH verify the code.)
The Nub function constructs a unique
list from an arbitrary input by traversing the input and tossing out elements that are already seen
:
The key membership test is done by isin
, whose output is True
exactly when the element is in the given list. 5
Exercise: (Append): Why does append
ing two UList
s not return a UList
? Fix the type signature below so that you can prove that the output is indeed unique
.
Exercise: (Range): range i j
returns the list of Int
between i
and j
. LiquidHaskell refuses to acknowledge that the output is indeed a UList
. Fix the code so that LiquidHaskell verifies that it implements the given signature (and of course, computes the same result.)
Hint: This may be easier to do after you read this chapter about lemmas.
A zipper is an aggregate data structure that is used to arbitrarily traverse the structure and update its contents. For example, a zipper for a list is a data type that contains an element (called focus
) that we are currently focus
-ed on, a list of elements to the left
of (i.e. before) the focus, and a list of elements to the right
(i.e. after) the focus.
xmonad is a wonderful tiling window manager, that uses a zipper to store the set of windows being managed. xmonad requires the crucial invariant that the values in the zipper be unique, that is, be free of duplicates.
We Refine Zipper to capture the requirement that legal zippers are unique. To this end, we state that the left
and right
lists are unique, disjoint, and do not contain focus
.
Our Refined Zipper Constructor makes illegal states unrepresentable. That is, by construction, we will ensure that every Zipper
is free of duplicates. For example, it is straightforward to create a valid Zipper
from a unique
list:
Exercise: (Deconstructing Zippers): Dually, the elements of a unique zipper tumble out into a unique list. Strengthen the types of reverse
and append
above so that LiquidHaskell accepts the below signatures for integrate
:
We can Shift the Focus element to the left or right while preserving the uniqueness invariant. Here’s the code that shifts the focus to the left:
To shift to the right, we simply reverse the elements and shift to the left:
To Filter elements from a zipper, we need to take care when the focus
itself, or all the elements get eliminated. In the latter case, there is no Zipper
and so the operation returns a Maybe
:
Thus, by using LiquidHaskell’s refinement types, and the SMT solvers native reasoning about sets, we can ensure the key uniqueness invariant holds in the presence of various tricky operations that are performed over Zipper
s.
In this chapter, we saw how SMT solvers can let us reason precisely about the actual contents of data structures, via the theory of sets. In particular, we saw how to:
Lift set-theoretic primitives to refined Haskell functions from the Data.Set
library,
Define measures like elts
that characterize the set of elements of structures, and unique
that describe high-level application specific properties about those sets,
Specify and verify that implementations enjoy various functional correctness properties, e.g. that sorting routines return permutations of their inputs, and various zipper operators preserve uniqueness.
Next, we present a variety of longer case-studies that illustrate the techniques developed so far on particular application domains.
See this for a brief description of how to work directly with the set operators natively supported by LiquidHaskell.↩︎
See this recent paper to learn how modern SMT solvers prove equalities like the above.↩︎
The SBV and Leon projects describe a different DSL based approach for using SMT solvers from Haskell and Scala respectively.↩︎
Since we are focusing on the elements, lets not distract ourselves with the ordering invariant and reuse plain old lists. See this for how to specify and verify order with plain old lists.↩︎
Which should be clear by now, if you did a certain exercise above .↩︎