Boolean Measures

In the last two chapters, we saw how refinements could be used to reason about the properties of basic Int values like vector indices, or the elements of a list. Next, lets see how we can describe properties of aggregate structures like lists and trees, and use these properties to improve the APIs for operating over such structures.

Partial Functions

As a motivating example, let us return to the problem of ensuring the safety of division. Recall that we wrote:

{-@ divide :: Int -> NonZero -> Int @-} divide _ 0 = die "divide-by-zero" divide x n = x `div` n


The Precondition asserted by the input type NonZero allows LiquidHaskell to prove that the die is never executed at run-time, but consequently, requires us to establish that wherever divide is used, the second parameter be provably non-zero. This requirement is not onerous when we know what the divisor is statically

avg2 x y = divide (x + y) 2 avg3 x y z = divide (x + y + z) 3

However, it can be more of a challenge when the divisor is obtained dynamically. For example, lets write a function to find the number of elements in a list

size :: [a] -> Int size [] = 0 size (_:xs) = 1 + size xs

and use it to compute the average value of a list:

avgMany xs = divide total elems where total = sum xs elems = size xs

Uh oh. LiquidHaskell wags its finger at us!

     src/04-measure.lhs:77:27-31: Error: Liquid Type Mismatch
       Inferred type
         VV : Int | VV == elems
      
       not a subtype of Required type
         VV : Int | 0 /= VV
      
       In Context
         VV    : Int | VV == elems
         elems : Int


We cannot prove that the divisor is NonZero, because it can be 0 -- when the list is empty. Thus, we need a way of specifying that the input to avgMany is indeed non-empty!

Lifting Functions to Measures


How shall we tell LiquidHaskell that a list is non-empty? Recall the notion of measure previously introduced to describe the size of a Data.Vector. In that spirit, lets write a function that computes whether a list is not empty:

notEmpty :: [a] -> Bool notEmpty [] = False notEmpty (_:_) = True


A measure is a total Haskell function,

  1. With a single equation per data constructor, and
  2. Guaranteed to terminate, typically via structural recursion.

We can tell LiquidHaskell to lift a function meeting the above requirements into the refinement logic by declaring:

{-@ measure notEmpty @-}


Non-Empty Lists can now be described as the subset of plain old Haskell lists [a] for which the predicate notEmpty holds

{-@ type NEList a = {v:[a] | notEmpty v} @-}

We can now refine various signatures to establish the safety of the list-average function.


Size returns a non-zero value if the input list is not-empty. We capture this condition with an implication in the output refinement.

{-@ size :: xs:[a] -> {v:Nat | notEmpty xs => v > 0} @-}


Average is only sensible for non-empty lists. Happily, we can specify this using the refined NEList type:

{-@ average :: NEList Int -> Int @-} average xs = divide total elems where total = sum xs elems = size xs


Exercise: (Average, Maybe): Fix the code below to obtain an alternate variant average' that returns Nothing for empty lists:



average' :: [Int] -> Maybe Int average' xs | ok = Just $ divide (sum xs) elems | otherwise = Nothing where elems = size xs ok = elems > 0 -- What expression goes here?


Exercise: (Debugging Specifications): An important aspect of formal verifiers like LiquidHaskell is that they help establish properties not just of your implementations but equally, or more importantly, of your specifications. In that spirit, can you explain why the following two variants of size are rejected by LiquidHaskell?



{-@ size1 :: xs:NEList a -> Pos @-} size1 [] = 0 size1 (_:xs) = 1 + size1 xs {-@ size2 :: xs:[a] -> {v:Int | notEmpty xs => v > 0} @-} size2 [] = 0 size2 (_:xs) = 1 + size2 xs

A Safe List API

Now that we can talk about non-empty lists, we can ensure the safety of various list-manipulating functions which are only well-defined on non-empty lists and crash otherwise.


Head and Tail are two of the canonical dangerous functions, that only work on non-empty lists, and burn horribly otherwise. We can type them simple as:

{-@ head :: NEList a -> a @-} head (x:_) = x head [] = die "Fear not! 'twill ne'er come to pass" {-@ tail :: NEList a -> [a] @-} tail (_:xs) = xs tail [] = die "Relaxeth! this too shall ne'er be"

LiquidHaskell uses the precondition to deduce that the second equations are dead code. Of course, this requires us to establish that callers of head and tail only invoke the respective functions with non-empty lists.


Exercise: (Safe Head): Write down a specification for null such that safeHead is verified. Do not force null to only take non-empty inputs, that defeats the purpose. Instead, it's type should say that it works on all lists and returns True if and only if the input is non-empty.



Hint: You may want to refresh your memory about implies ==> and <=> from the chapter on logic.

safeHead :: [a] -> Maybe a safeHead xs | null xs = Nothing | otherwise = Just $ head xs {-@ null :: [a] -> Bool @-} null [] = True null (_:_) = False


Groups Lets use the above to write a function that chunks sequences into non-empty groups of equal elements:

{-@ groupEq :: (Eq a) => [a] -> [NEList a] @-} groupEq [] = [] groupEq (x:xs) = (x:ys) : groupEq zs where (ys, zs) = span (x ==) xs

By using the fact that each element in the output returned by groupEq is in fact of the form x:ys, LiquidHaskell infers that groupEq returns a [NEList a] that is, a list of non-empty lists.


To Eliminate Stuttering from a string, we can use groupEq to split the string into blocks of repeating Chars, and then just extract the first Char from each block:

-- >>> eliminateStutter "ssstringssss liiiiiike thisss" -- "strings like this" eliminateStutter xs = map head $ groupEq xs

LiquidHaskell automatically instantiates the type parameter for map in eliminateStutter to notEmpty v to deduce that head is only called on non-empty lists.


Foldr1 is one of my favorite folds; it uses the first element of the sequence as the initial value. Of course, it should only be called with non-empty sequences!

{-@ foldr1 :: (a -> a -> a) -> NEList a -> a @-} foldr1 f (x:xs) = foldr f x xs foldr1 _ [] = die "foldr1" foldr :: (a -> b -> b) -> b -> [a] -> b foldr _ acc [] = acc foldr f acc (x:xs) = f x (foldr f acc xs)


To Sum a non-empty list of numbers, we can just perform a foldr1 with the + operator: Thanks to the precondition, LiquidHaskell will prove that the die code is indeed dead. Thus, we can write

{-@ sum :: (Num a) => NEList a -> a @-} sum [] = die "cannot add up empty list" sum xs = foldr1 (+) xs

Consequently, we can only invoke sum on non-empty lists, so:

sumOk = sum [1,2,3,4,5] -- is accepted by LH, but sumBad = sum [] -- is rejected by LH


Exercise: (Weighted Average): The function below computes a weighted average of its input. Unfortunately, LiquidHaskell is not very happy about it. Can you figure out why, and fix the code or specification appropriately?



{-@ wtAverage :: NEList (Pos, Pos) -> Int @-} wtAverage wxs = divide totElems totWeight where elems = map (\(w, x) -> w * x) wxs weights = map (\(w, _) -> w ) wxs totElems = sum elems totWeight = sum weights sum = foldr1 (+) map :: (a -> b) -> [a] -> [b] map _ [] = [] map f (x:xs) = f x : map f xs

Hint: On what variables are the errors? How are those variables' values computed? Can you think of a better specification for the function(s) doing those computations?


Exercise: (Mitchell's Risers): Non-empty lists pop up in many places, and it is rather convenient to have the type system track non-emptiness without having to make up special types. Consider the risers function, popularized by Neil Mitchell. safeSplit requires its input be non-empty; but LiquidHaskell believes that the call inside risers fails this requirement. Fix the specification for risers so that it is verified.



{-@ risers :: (Ord a) => xs:[a] -> {v: [[a]] | notEmpty xs => notEmpty v} @-} risers :: (Ord a) => [a] -> [[a]] risers [] = [] risers [x] = [[x]] risers (x:y:etc) | x <= y = (x:s) : ss | otherwise = [x] : (s : ss) where (s, ss) = safeSplit $ risers (y:etc) {-@ safeSplit :: NEList a -> (a, [a]) @-} safeSplit (x:xs) = (x, xs) safeSplit _ = die "don't worry, be happy"

Recap

In this chapter we saw how LiquidHaskell lets you

  1. Define structural properties of data types,

  2. Use refinements over these properties to describe key invariants that establish, at compile-time, the safety of operations that might otherwise fail on unexpected values at run-time, all while,

  3. Working with plain Haskell types, here, Lists, without having to make up new types which can have the unfortunate effect of adding a multitude of constructors and conversions which often clutter implementations and specifications.

Of course, we can do a lot more with measures, so lets press on!