loop
Often, correctness requires us to reason about the set of elements represented inside a data structure, or manipulated by a function. Examples of this abound: for example, we’d like to know that:
sorting routines return permutations of their inputs – i.e. return collections whose elements are the same as the input set,
resource management functions do not inadvertently create duplicate elements or drop elements from set of tracked resources.
syntax-tree manipulating procedures create well-scoped trees where the set of used variables are contained within the set of variables previously defined.
SMT Solvers support very expressive logics. In addition
to linear arithmetic and uninterpreted functions, they can efficiently decide
formulas over sets. Next, let’s see how LiquidHaskell lets us exploit
this fact to develop types and interfaces that guarantee invariants over
the set of elements of a structures.
First, we need a way to talk about sets in the refinement logic. We
could roll our own special Haskell type but for now, let’s just use the
Set a
type from the prelude’s Data.Set
.1
LiquidHaskell Lifts the basic set operators from
Data.Set
into the refinement logic. That is, the prelude
defines the following logical functions that correspond to the
Haskell functions of the same name:
measure empty :: Set a
measure singleton :: a -> Set a
measure member :: a -> Set a -> Bool
measure union :: Set a -> Set a -> Set a
measure intersection :: Set a -> Set a -> Set a
measure difference :: Set a -> Set a -> Set a
The above operators are interpreted by the SMT solver. That is, just like the SMT solver “knows”, via the axioms of the theory of arithmetic that: \[x = 1 + 1 \Rightarrow x = 2\] is a valid formula, i.e. holds for all \(x\), the solver “knows” that: \[x = \tsng{1} \Rightarrow y = \tsng{2} \Rightarrow x = \tcap{x}{\tcup{y}{x}}\] This is because, the above formulas belong to a decidable Theory of Sets reduces to McCarthy’s more general Theory of Arrays. 2
To get the hang of whats going on, let’s do a few warm up exercises, using LiquidHaskell to prove various simple theorems about sets and operations over them.
We Refine The Set API to make it easy to write down
theorems. That is, we give the operators in Data.Set
refinement type signatures that precisely track their set-theoretic
behavior:
empty :: {v:Set a | v = empty}
member :: x:a
-> s:Set a
-> {v:Bool | v <=> member x s}
singleton :: x:a -> {v:Set a | v = singleton x}
union :: x:Set a
-> y:Set a
-> {v:Set a | v = union x y}
intersection :: x:Set a
-> y:Set a
-> {v:Set a | v = intersection x y}
difference :: x:Set a
-> y:Set a
-> {v:Set a | v = difference x y}
We Can Assert Theorems as QuickCheck style properties, that is, as
functions from arbitrary inputs to a Bool
output that must
always be True
. Lets define aliases for the
Bool
eans that are always True
or
False
We can use True
to state theorems. For example, the
unexciting arithmetic equality above becomes:
Where implies
is just the implication function over
Bool
and Implies p q
is defined as
Exercise: (Bounded Addition): Write and prove a
QuickCheck style theorem that: \(\forall x, y.
x < 100 \wedge y < 100 \Rightarrow x + y < 200\).
The Commutativity of Intersection can be easily stated
and proved as a QuickCheck style theorem:
The Associativity of Union can similarly be
confirmed:
The Distributivity Laws for Boolean Algebra can be
verified by writing properties over the relevant operators. For example,
let’s check that intersection
distributes over
union
:
Non-Theorems should be rejected. So, while we’re at it,
let’s make sure LiquidHaskell doesn’t prove anything that isn’t
true …
Exercise: (Set Difference): Why does the above property
fail?
Use QuickCheck (or your own little grey cells) to find a
counterexample for the property
prop_cup_dif_bad
.
Use the counterexample to assign pre
a non-trivial
(i.e. other than False
) condition so that the property can
be proved.
Thus, LiquidHaskell’s refined types offer a nice interface for interacting with the SMT solvers in order to prove theorems, while letting us use QuickCheck to generate counterexamples.3
Lets return to our real goal, which is to verify properties of programs. First, we need a way to refine the list API to precisely track the set of elements in a list.
The Elements of a List can be described by a simple
recursive measure that walks over the list, building up the set:
Lets write a few helpful aliases for various refined lists that will then make the subsequent specifications pithy and crisp.
S
X
X
X
and
Y
X
and the contents of
Y
The Measures strengthens the data constructors for
lists. That is we get the automatically refined types for “nil” and
“cons”:
data [a] where
[] :: ListEmp a
(:) :: x:a -> xs:[a] -> ListUn1 a x xs
Lets take our new vocabulary out for a spin!
The Append function returns a list whose elements are
the union of the elements of the input Lists:
Exercise: (Reverse): Write down a type for
revHelper
so that reverse'
is verified by
LiquidHaskell.
Exercise: (Halve): Write down a specification for
halve
such that the subsequent “theorem”
prop_halve_append
is proved by LiquidHaskell.
Hint: You may want to remind yourself about the
dimension-aware signature for partition
from the
earlier chapter.
Exercise: (Membership): Write down a signature for
elem
that suffices to verify test1
and
test2
.
Next, let’s use the refined list API to prove that various sorting routines return permutations of their inputs, that is, return output lists whose elements are the same as those of the input lists.4
Insertion Sort is the simplest of all the list sorting
routines; we build up an (ordered) output list insert
ing
each element of the input list into the appropriate position of the
output:
Thus, the output of insert
has all the elements of the
input xs
, plus the new element x
:
The above signature lets us prove that the output of the sorting routine indeed has the elements of the input:
Exercise: (Merge): Fix the specification of
merge
so that the subsequent property
prop_merge_app
is verified by LiquidHaskell.
Exercise: (Merge Sort): Once you write the correct type
for merge
above, you should be able to prove the unexpected
signature for mergeSort
below.
Make sure you are able verify the given signature.
Obviously we don’t want mergeSort
to return the
empty list, so there’s a bug. Find and fix it, so that you
cannot prove that the output is empty, but can instead
prove that the output is ListEq a xs
.
Often, we want to enforce the invariant that a particular collection contains no duplicates; as multiple copies in a collection of file handles or system resources can create unpleasant leaks. For example, the xmonad window manager creates a sophisticated zipper data structure to hold the list of active user windows and carefully maintains the invariant that that the zipper contains no duplicates. Next, let’s see how to specify and verify this invariant using LiquidHaskell, first for lists, and then for a simplified zipper.
To Specify Uniqueness we need a way of saying that a
list has no duplicates. There are many ways to do so; the
simplest is a measure:
We can use the above to write an alias for duplicate-free lists
Lets quickly check that the right lists are indeed
unique
The Filter function returns a subset of its
elements, and hence, preserves uniqueness. That is, if the
input is unique, the output is too:
Exercise: (Filter): It seems a bit draconian to require
that filter
only be called with unique lists. Write down a
more permissive type for filter'
below such that the
subsequent uses are verified by LiquidHaskell.
Exercise: (Reverse): When we reverse
their
order, the set of elements is unchanged, and hence unique (if the input
was unique). Why does LiquidHaskell reject the below? Can you fix things
so that we can prove that the output is a UList a
? (When
you are done, you should be able to remove the assume
from
the signature below, and still have LH verify the code.)
The Nub function constructs a unique
list
from an arbitrary input by traversing the input and tossing out elements
that are already seen
:
The key membership test is done by isin
, whose output is
True
exactly when the element is in the given list. 5
Exercise: (Append): Why does append
ing two
UList
s not return a UList
? Fix the type
signature below so that you can prove that the output is indeed
unique
.
Exercise: (Range): range i j
returns the
list of Int
between i
and j
.
LiquidHaskell refuses to acknowledge that the output is indeed a
UList
. Fix the code so that LiquidHaskell verifies that it
implements the given signature (and of course, computes the same
result.)
Hint: This may be easier to do after you read this chapter about lemmas.
A zipper is an aggregate data structure
that is used to arbitrarily traverse the structure and update its
contents. For example, a zipper for a list is a data type that contains
an element (called focus
) that we are currently
focus
-ed on, a list of elements to the left
of
(i.e. before) the focus, and a list of elements to the
right
(i.e. after) the focus.
xmonad is a wonderful tiling window manager, that uses
a zipper
to store the set of windows being managed. xmonad requires the crucial
invariant that the values in the zipper be unique, that is, be free of
duplicates.
We Refine Zipper to capture the requirement that legal
zippers are unique. To this end, we state that the left
and
right
lists are unique, disjoint, and do not contain
focus
.
Our Refined Zipper Constructor makes illegal states
unrepresentable. That is, by construction, we will ensure that
every Zipper
is free of duplicates. For example, it is
straightforward to create a valid Zipper
from a
unique
list:
Exercise: (Deconstructing Zippers): Dually, the
elements of a unique zipper tumble out into a unique list. Strengthen
the types of reverse
and append
above so that
LiquidHaskell accepts the below signatures for
integrate
:
We can Shift the Focus element to the left or right
while preserving the uniqueness invariant. Here’s the code that shifts
the focus to the left:
To shift to the right, we simply reverse the elements and shift to the left:
To Filter elements from a zipper, we need to take care
when the focus
itself, or all the elements get eliminated.
In the latter case, there is no Zipper
and so the operation
returns a Maybe
:
Thus, by using LiquidHaskell’s refinement types, and the SMT solvers
native reasoning about sets, we can ensure the key uniqueness invariant
holds in the presence of various tricky operations that are performed
over Zipper
s.
In this chapter, we saw how SMT solvers can let us reason precisely about the actual contents of data structures, via the theory of sets. In particular, we saw how to:
Lift set-theoretic primitives to refined Haskell
functions from the Data.Set
library,
Define measures like elts
that characterize
the set of elements of structures, and unique
that describe
high-level application specific properties about those sets,
Specify and verify that implementations enjoy various functional correctness properties, e.g. that sorting routines return permutations of their inputs, and various zipper operators preserve uniqueness.
Next, we present a variety of longer case-studies that illustrate the techniques developed so far on particular application domains.
See this for a brief description of how to work directly with the set operators natively supported by LiquidHaskell.↩︎
See this recent paper to learn how modern SMT solvers prove equalities like the above.↩︎
The SBV and Leon projects describe a different DSL based approach for using SMT solvers from Haskell and Scala respectively.↩︎
Since we are focusing on the elements, let’s not distract ourselves with the ordering invariant and reuse plain old lists. See this for how to specify and verify order with plain old lists.↩︎
Which should be clear by now, if you did a certain exercise above .↩︎