Elemental Measures

Often, correctness requires us to reason about the set of elements represented inside a data structure, or manipulated by a function. Examples of this abound: for example, we’d like to know that:

  • sorting routines return permutations of their inputs – i.e. return collections whose elements are the same as the input set,

  • resource management functions do not inadvertently create duplicate elements or drop elements from set of tracked resources.

  • syntax-tree manipulating procedures create well-scoped trees where the set of used variables are contained within the set of variables previously defined.


SMT Solvers support very expressive logics. In addition to linear arithmetic and uninterpreted functions, they can efficiently decide formulas over sets. Next, let’s see how LiquidHaskell lets us exploit this fact to develop types and interfaces that guarantee invariants over the set of elements of a structures.

Talking about Sets

First, we need a way to talk about sets in the refinement logic. We could roll our own special Haskell type but for now, let’s just use the Set a type from the prelude’s Data.Set.1


LiquidHaskell Lifts the basic set operators from Data.Set into the refinement logic. That is, the prelude defines the following logical functions that correspond to the Haskell functions of the same name:

measure empty        :: Set a
measure singleton    :: a -> Set a
measure member       :: a -> Set a -> Bool
measure union        :: Set a -> Set a -> Set a
measure intersection :: Set a -> Set a -> Set a
measure difference   :: Set a -> Set a -> Set a

The above operators are interpreted by the SMT solver. That is, just like the SMT solver “knows”, via the axioms of the theory of arithmetic that: \[x = 1 + 1 \Rightarrow x = 2\] is a valid formula, i.e. holds for all \(x\), the solver “knows” that: \[x = \tsng{1} \Rightarrow y = \tsng{2} \Rightarrow x = \tcap{x}{\tcup{y}{x}}\] This is because, the above formulas belong to a decidable Theory of Sets reduces to McCarthy’s more general Theory of Arrays. 2

Proving QuickCheck Style Properties

To get the hang of whats going on, let’s do a few warm up exercises, using LiquidHaskell to prove various simple theorems about sets and operations over them.


We Refine The Set API to make it easy to write down theorems. That is, we give the operators in Data.Set refinement type signatures that precisely track their set-theoretic behavior:

empty        :: {v:Set a | v = empty}
member       :: x:a
             -> s:Set a
             -> {v:Bool | v <=> member x s}

singleton    :: x:a -> {v:Set a | v = singleton x}

union        :: x:Set a
             -> y:Set a
             -> {v:Set a | v = union x y}

intersection :: x:Set a
             -> y:Set a
             -> {v:Set a | v = intersection x y}

difference   :: x:Set a
             -> y:Set a
             -> {v:Set a | v = difference x y}


We Can Assert Theorems as QuickCheck style properties, that is, as functions from arbitrary inputs to a Bool output that must always be True. Lets define aliases for the Booleans that are always True or False

{-@ type True = {v:Bool | v} @-} {-@ type False = {v:Bool | not v} @-}

We can use True to state theorems. For example, the unexciting arithmetic equality above becomes:

{-@ prop_one_plus_one_eq_two :: _ -> True @-} prop_one_plus_one_eq_two x = (x == 1 + 1) `implies` (x == 2)

Where implies is just the implication function over Bool

{-@ implies :: p:Bool -> q:Bool -> Implies p q @-} implies False _ = True implies _ True = True implies _ _ = False

and Implies p q is defined as

{-@ type Implies P Q = {v:_ | v <=> (P => Q)} @-}


Exercise: (Bounded Addition): Write and prove a QuickCheck style theorem that: \(\forall x, y. x < 100 \wedge y < 100 \Rightarrow x + y < 200\).



{-@ prop_x_y_200 :: _ -> _ -> True @-} prop_x_y_200 x y = False -- fill in the theorem body


The Commutativity of Intersection can be easily stated and proved as a QuickCheck style theorem:

{-@ prop_intersection_comm :: _ -> _ -> True @-} prop_intersection_comm x y = (x `intersection` y) == (y `intersection` x)


The Associativity of Union can similarly be confirmed:

{-@ prop_union_assoc :: _ -> _ -> _ -> True @-} prop_union_assoc x y z = (x `union` (y `union` z)) == (x `union` y) `union` z


The Distributivity Laws for Boolean Algebra can be verified by writing properties over the relevant operators. For example, let’s check that intersection distributes over union:

{-@ prop_intersection_dist :: _ -> _ -> _ -> True @-} prop_intersection_dist x y z = x `intersection` (y `union` z) == (x `intersection` y) `union` (x `intersection` z)


Non-Theorems should be rejected. So, while we’re at it, let’s make sure LiquidHaskell doesn’t prove anything that isn’t true …

{-@ prop_cup_dif_bad :: _ -> _ -> True @-} prop_cup_dif_bad x y = pre `implies` (x == ((x `union` y) `difference` y)) where pre = True -- Fix this with a non-trivial precondition


Exercise: (Set Difference): Why does the above property fail?

  1. Use QuickCheck (or your own little grey cells) to find a counterexample for the property prop_cup_dif_bad.

  2. Use the counterexample to assign pre a non-trivial (i.e. other than False) condition so that the property can be proved.



Thus, LiquidHaskell’s refined types offer a nice interface for interacting with the SMT solvers in order to prove theorems, while letting us use QuickCheck to generate counterexamples.3

Content-Aware List API

Lets return to our real goal, which is to verify properties of programs. First, we need a way to refine the list API to precisely track the set of elements in a list.


The Elements of a List can be described by a simple recursive measure that walks over the list, building up the set:

{-@ measure elts @-} elts :: (Ord a) => [a] -> Set a elts [] = empty elts (x:xs) = singleton x `union` elts xs

Lets write a few helpful aliases for various refined lists that will then make the subsequent specifications pithy and crisp.

  • A list with elements S
{-@ type ListS a S = {v:[a] | elts v = S} @-}
  • An empty list
{-@ type ListEmp a = ListS a {Set_empty 0} @-}
  • A list whose contents equal those of list X
{-@ type ListEq a X = ListS a {elts X} @-}
  • A list whose contents are a subset of list X
{-@ type ListSub a X = {v:[a]| Set_sub (elts v) (elts X)} @-}
  • A list whose contents are the union of lists X and Y
{-@ type ListUn a X Y = ListS a {Set_cup (elts X) (elts Y)} @-}
  • A list whose contents are exactly X and the contents of Y
{-@ type ListUn1 a X Y = ListS a {Set_cup (Set_sng X) (elts Y)} @-}


The Measures strengthens the data constructors for lists. That is we get the automatically refined types for “nil” and “cons”:

data [a] where
  []  :: ListEmp a
  (:) :: x:a -> xs:[a] -> ListUn1 a x xs

Lets take our new vocabulary out for a spin!


The Append function returns a list whose elements are the union of the elements of the input Lists:

{-@ append' :: xs:_ -> ys:_ -> ListUn a xs ys @-} append' [] ys = ys append' (x:xs) ys = x : append' xs ys


Exercise: (Reverse): Write down a type for revHelper so that reverse' is verified by LiquidHaskell.



{-@ reverse' :: xs:[a] -> ListEq a xs @-} reverse' xs = revHelper [] xs revHelper acc [] = acc revHelper acc (x:xs) = revHelper (x:acc) xs


Exercise: (Halve): Write down a specification for halve such that the subsequent “theorem” prop_halve_append is proved by LiquidHaskell.



halve :: Int -> [a] -> ([a], [a]) halve 0 xs = ([], xs) halve n (x:y:zs) = (x:xs, y:ys) where (xs, ys) = halve (n-1) zs halve _ xs = ([], xs) {-@ prop_halve_append :: _ -> _ -> True @-} prop_halve_append n xs = elts xs == elts xs' where xs' = append' ys zs (ys, zs) = halve n xs

Hint: You may want to remind yourself about the dimension-aware signature for partition from the earlier chapter.


Exercise: (Membership): Write down a signature for elem that suffices to verify test1 and test2.



{-@ elem :: (Eq a) => a -> [a] -> Bool @-} elem _ [] = False elem x (y:ys) = x == y || elem x ys {-@ test1 :: True @-} test1 = elem 2 [1, 2, 3] {-@ test2 :: False @-} test2 = elem 2 [1, 3]

Permutations

Next, let’s use the refined list API to prove that various sorting routines return permutations of their inputs, that is, return output lists whose elements are the same as those of the input lists.4


Insertion Sort is the simplest of all the list sorting routines; we build up an (ordered) output list inserting each element of the input list into the appropriate position of the output:

insert x [] = [x] insert x (y:ys) | x <= y = x : y : ys | otherwise = y : insert x ys

Thus, the output of insert has all the elements of the input xs, plus the new element x:

{-@ insert :: x:a -> xs:[a] -> ListUn1 a x xs @-}

The above signature lets us prove that the output of the sorting routine indeed has the elements of the input:

{-@ insertSort :: (Ord a) => xs:[a] -> ListEq a xs @-} insertSort [] = [] insertSort (x:xs) = insert x (insertSort xs)


Exercise: (Merge): Fix the specification of merge so that the subsequent property prop_merge_app is verified by LiquidHaskell.



{-@ merge :: xs:[a] -> ys:[a] -> [a] @-} merge [] ys = ys merge xs [] = xs merge (x:xs) (y:ys) | x <= y = x : merge xs (y:ys) | otherwise = y : merge (x:xs) ys {-@ prop_merge_app :: _ -> _ -> True @-} prop_merge_app xs ys = elts zs == elts zs' where zs = append' xs ys zs' = merge xs ys


Exercise: (Merge Sort): Once you write the correct type for merge above, you should be able to prove the unexpected signature for mergeSort below.

  1. Make sure you are able verify the given signature.

  2. Obviously we don’t want mergeSort to return the empty list, so there’s a bug. Find and fix it, so that you cannot prove that the output is empty, but can instead prove that the output is ListEq a xs.



{-@ mergeSort :: (Ord a) => xs:[a] -> ListEmp a @-} mergeSort [] = [] mergeSort xs = merge (mergeSort ys) (mergeSort zs) where (ys, zs) = halve mid xs mid = length xs `div` 2

Uniqueness

Often, we want to enforce the invariant that a particular collection contains no duplicates; as multiple copies in a collection of file handles or system resources can create unpleasant leaks. For example, the xmonad window manager creates a sophisticated zipper data structure to hold the list of active user windows and carefully maintains the invariant that that the zipper contains no duplicates. Next, let’s see how to specify and verify this invariant using LiquidHaskell, first for lists, and then for a simplified zipper.


To Specify Uniqueness we need a way of saying that a list has no duplicates. There are many ways to do so; the simplest is a measure:

{-@ measure unique @-} unique :: (Ord a) => [a] -> Bool unique [] = True unique (x:xs) = unique xs && not (member x (elts xs))

We can use the above to write an alias for duplicate-free lists

{-@ type UList a = {v:[a] | unique v }@-}

Lets quickly check that the right lists are indeed unique

{-@ isUnique :: UList Int @-} isUnique = [1, 2, 3] -- accepted by LH {-@ isNotUnique :: UList Int @-} isNotUnique = [1, 2, 3, 1] -- rejected by LH


The Filter function returns a subset of its elements, and hence, preserves uniqueness. That is, if the input is unique, the output is too:

{-@ filter :: (a -> Bool) -> xs:UList a -> {v:ListSub a xs | unique v} @-} filter _ [] = [] filter f (x:xs) | f x = x : xs' | otherwise = xs' where xs' = filter f xs


Exercise: (Filter): It seems a bit draconian to require that filter only be called with unique lists. Write down a more permissive type for filter' below such that the subsequent uses are verified by LiquidHaskell.



filter' _ [] = [] filter' f (x:xs) | f x = x : xs' | otherwise = xs' where xs' = filter' f xs {-@ test3 :: UList _ @-} test3 = filter' (> 2) [1,2,3,4] {-@ test4 :: [_] @-} test4 = filter' (> 3) [3,1,2,3]


Exercise: (Reverse): When we reverse their order, the set of elements is unchanged, and hence unique (if the input was unique). Why does LiquidHaskell reject the below? Can you fix things so that we can prove that the output is a UList a? (When you are done, you should be able to remove the assume from the signature below, and still have LH verify the code.)



{-@ assume reverse :: xs:UList a -> UList a @-} reverse :: [a] -> [a] reverse = go [] where {-@ go :: acc:[a] -> xs:[a] -> [a] @-} go acc [] = acc go acc (x:xs) = go (x:acc) xs


The Nub function constructs a unique list from an arbitrary input by traversing the input and tossing out elements that are already seen:

{-@ nub :: [a] -> UList a @-} nub xs = go [] xs where {-@ go :: UList a -> xs:[a] -> UList a / [len xs] @-} go seen [] = seen go seen (x:xs) | x `isin` seen = go seen xs | otherwise = go (x:seen) xs

The key membership test is done by isin, whose output is True exactly when the element is in the given list. 5

-- FIXME {-@ predicate In X Xs = Set_mem X (elts Xs) @-} {-@ isin :: x:_ -> ys:_ -> {v:Bool | v <=> In x ys }@-} isin x (y:ys) | x == y = True | otherwise = x `isin` ys isin _ [] = False


Exercise: (Append): Why does appending two ULists not return a UList? Fix the type signature below so that you can prove that the output is indeed unique.



{-@ append :: UList a -> UList a -> UList a @-} append [] ys = ys append (x:xs) ys = x : append xs ys


Exercise: (Range): range i j returns the list of Int between i and j. LiquidHaskell refuses to acknowledge that the output is indeed a UList. Fix the code so that LiquidHaskell verifies that it implements the given signature (and of course, computes the same result.)



{-@ type Btwn I J = {v:_ | I <= v && v < J} @-} {-@ range :: i:Int -> j:Int -> UList (Btwn i j) @-} range i j | i < j = i : range (i + 1) j | otherwise = []

Hint: This may be easier to do after you read this chapter about lemmas.

Unique Zippers

A zipper is an aggregate data structure that is used to arbitrarily traverse the structure and update its contents. For example, a zipper for a list is a data type that contains an element (called focus) that we are currently focus-ed on, a list of elements to the left of (i.e. before) the focus, and a list of elements to the right (i.e. after) the focus.

data Zipper a = Zipper { focus :: a , left :: [a] , right :: [a] }


xmonad is a wonderful tiling window manager, that uses a zipper to store the set of windows being managed. xmonad requires the crucial invariant that the values in the zipper be unique, that is, be free of duplicates.


We Refine Zipper to capture the requirement that legal zippers are unique. To this end, we state that the left and right lists are unique, disjoint, and do not contain focus.

{-@ data Zipper a = Zipper { focus :: a , left :: {v: UList a | not (In focus v)} , right :: {v: UList a | not (In focus v) && Disj v left } } @-} {-@ predicate Disj X Y = Disjoint (elts X) (elts Y) @-}


Our Refined Zipper Constructor makes illegal states unrepresentable. That is, by construction, we will ensure that every Zipper is free of duplicates. For example, it is straightforward to create a valid Zipper from a unique list:

{-@ differentiate :: UList a -> Maybe (Zipper a) @-} differentiate [] = Nothing differentiate (x:xs) = Just $ Zipper x [] xs


Exercise: (Deconstructing Zippers): Dually, the elements of a unique zipper tumble out into a unique list. Strengthen the types of reverse and append above so that LiquidHaskell accepts the below signatures for integrate:



{-@ integrate :: Zipper a -> UList a @-} integrate (Zipper x l r) = reverse l `append` (x : r)


We can Shift the Focus element to the left or right while preserving the uniqueness invariant. Here’s the code that shifts the focus to the left:

focusLeft :: Zipper a -> Zipper a focusLeft (Zipper t (l:ls) rs) = Zipper l ls (t:rs) focusLeft (Zipper t [] rs) = Zipper x xs [] where (x:xs) = reverse (t:rs)

To shift to the right, we simply reverse the elements and shift to the left:

focusRight :: Zipper a -> Zipper a focusRight = reverseZipper . focusLeft . reverseZipper reverseZipper :: Zipper a -> Zipper a reverseZipper (Zipper t ls rs) = Zipper t rs ls


To Filter elements from a zipper, we need to take care when the focus itself, or all the elements get eliminated. In the latter case, there is no Zipper and so the operation returns a Maybe:

filterZipper :: (a -> Bool) -> Zipper a -> Maybe (Zipper a) filterZipper p (Zipper f ls rs) = case filter p (f:rs) of f':rs' -> Just $ Zipper f' (filter p ls) rs' [] -> case filter p ls of f':ls' -> Just $ Zipper f' ls' [] [] -> Nothing

Thus, by using LiquidHaskell’s refinement types, and the SMT solvers native reasoning about sets, we can ensure the key uniqueness invariant holds in the presence of various tricky operations that are performed over Zippers.

Recap

In this chapter, we saw how SMT solvers can let us reason precisely about the actual contents of data structures, via the theory of sets. In particular, we saw how to:

  • Lift set-theoretic primitives to refined Haskell functions from the Data.Set library,

  • Define measures like elts that characterize the set of elements of structures, and unique that describe high-level application specific properties about those sets,

  • Specify and verify that implementations enjoy various functional correctness properties, e.g. that sorting routines return permutations of their inputs, and various zipper operators preserve uniqueness.

Next, we present a variety of longer case-studies that illustrate the techniques developed so far on particular application domains.


  1. See this for a brief description of how to work directly with the set operators natively supported by LiquidHaskell.↩︎

  2. See this recent paper to learn how modern SMT solvers prove equalities like the above.↩︎

  3. The SBV and Leon projects describe a different DSL based approach for using SMT solvers from Haskell and Scala respectively.↩︎

  4. Since we are focusing on the elements, let’s not distract ourselves with the ordering invariant and reuse plain old lists. See this for how to specify and verify order with plain old lists.↩︎

  5. Which should be clear by now, if you did a certain exercise above .↩︎