**1.**Introduction {#intro}**1.1.**Well-Typed Programs Do Go Wrong {#gowrong}**1.2.**Refinement Types**1.3.**Audience**1.4.**Getting Started**2.**Logic & SMT**2.1.**Syntax**2.2.**Semantics {#semantics}**2.3.**Verification Conditions**2.4.**Examples: Propositions**2.5.**Examples: Arithmetic**2.6.**Examples: Uninterpreted Function**2.7.**Recap**3.**Refinement Types**3.1.**Defining Types {#definetype}**3.2.**Errors**3.3.**Subtyping**3.4.**Writing Specifications**3.5.**Refining Function Types: Pre-conditions**3.6.**Refining Function Types: Post-conditions**3.7.**Testing Values: Booleans and Propositions {#propositions}**3.8.**Putting It All Together**3.9.**Recap**4.**Polymorphism {#polymorphism}**4.1.**Specification: Vector Bounds {#vectorbounds}**4.2.**Verification: Vector Lookup**4.3.**Inference: Our First Recursive Function**4.4.**Higher-Order Functions: Bottling Recursion in a`loop`

**4.5.**Refinements and Polymorphism {#sparsetype}**4.6.**Recap**5.**Refined Datatypes {#refineddatatypes}**5.1.**Sparse Vectors Revisited {#autosmart}**5.2.**Ordered Lists {#orderedlists}**5.3.**Ordered Trees {#binarysearchtree}**5.4.**Recap**6.**Boolean Measures {#boolmeasures}**6.1.**Partial Functions**6.2.**Lifting Functions to Measures {#usingmeasures}**6.3.**A Safe List API**6.4.**Recap**7.**Numeric Measures {#numericmeasure}**7.1.**Wholemeal Programming**7.2.**Specifying List Dimensions**7.3.**Lists: Size Preserving API**7.4.**Lists: Size Reducing API {#listreducing}**7.5.**Dimension Safe Vector API**7.6.**Dimension Safe Matrix API**7.7.**Recap**8.**Elemental Measures {#setmeasure}**8.1.**Talking about Sets**8.2.**Proving QuickCheck Style Properties {#quickcheck}**8.3.**Content-Aware List API {#listelems}**8.4.**Permutations**8.5.**Uniqueness**8.6.**Unique Zippers**8.7.**Recap**9.**Case Study: Okasaki's Lazy Queues {#lazyqueue}**10.**Case Study: Associative Maps**10.1.**Specifying Maps {#mapapi}**10.2.**Using Maps: Well Scoped Expressions**10.3.**Implementing Maps: Binary Search Trees {#lemmas}**10.4.**Recap**11.**Case Study: Pointers & Bytes {#case-study-pointers}**11.1.**HeartBleeds in Haskell**11.2.**Low-level Pointer API**11.3.**A Refined Pointer API**11.4.**Assumptions vs Guarantees**11.5.**ByteString API**11.6.**Application API**11.7.**Nested ByteStrings**11.8.**Recap: Types Against Overflows**12.**Case Study: AVL Trees {#case-study-avltree}

{-@ LIQUID "--short-names" @-}
{-@ LIQUID "--no-termination" @-}
{-@ LIQUID "--scrape-used-imports" @-}
module Tutorial_04_Polymorphism
( safeLookup
, unsafeLookup
, vectorSum, vectorSum'
, absoluteSum, absoluteSum'
, dotProduct
, sparseProduct, sparseProduct'
, eeks
, head, head', head''
) where
import Prelude hiding (head, abs, length)
import Data.List (foldl')
import Data.Vector hiding (head, foldl')
absoluteSum' :: Vector Int -> Int
dotProduct :: Vector Int -> Vector Int -> Int
absoluteSum :: Vector Int -> Int
sparseProduct, sparseProduct' :: Vector Int -> [(Int, Int)] -> Int
-- {-@ fail eeks @-}
-- {-@ fail head @-}
-- {-@ fail unsafeLookup @-}
-- {-@ fail dotProduct @-}

Refinement types shine when we want to establish properties of
*polymorphic* datatypes and higher-order functions. Rather than
be abstract, let’s illustrate this with a classic
use-case.

**Array Bounds Verification** aims to ensure that the
indices used to retrieve values from an array are indeed *valid*
for the array, i.e. are between `0`

and the *size* of
the array. For example, suppose we create an `array`

with two
elements:

`twoLangs = fromList ["haskell", "javascript"]`

Lets attempt to look it up at various indices:

eeks = [ok, yup, nono]
where
ok = twoLangs ! 0
yup = twoLangs ! 1
nono = twoLangs ! 3

If we try to *run* the above, we get a nasty shock: an
exception that says we’re trying to look up `twoLangs`

at
index `3`

whereas the size of `twoLangs`

is just
`2`

.

```
Prelude> :l 03-poly.lhs
[1 of 1] Compiling VectorBounds ( 03-poly.lhs, interpreted )
Ok, modules loaded: VectorBounds.
*VectorBounds> eeks
Loading package ... done.
"*** Exception: ./Data/Vector/Generic.hs:249 ((!)): index out of bounds (3,2)
```

**In a suitable Editor** e.g. Vim or Emacs, or if you push
the “play” button in the online demo, you will literally see the error
*without* running the code. Lets see how LiquidHaskell checks
`ok`

and `yup`

but flags `nono`

, and
along the way, learn how it reasons about *recursion*,
*higher-order functions*, *data types* and
*polymorphism*.

First, let’s see how to *specify* array bounds safety by
*refining* the types for the key
functions exported by `Data.Vector`

, i.e. how to

*define*the size of a`Vector`

*compute*the size of a`Vector`

*restrict*the indices to those that are valid for a given size.

We can write specifications for imported modules – for which we
*lack* the code – either directly in the client’s source file or
better, in `.spec`

files which can be reused across multiple
client modules.

**Include** directories can be specified when checking a
file. Suppose we want to check some file `target.hs`

that
imports an external dependency `Data.Vector`

. We can write
specifications for `Data.Vector`

inside
`include/Data/Vector.spec`

which contains:

```
-- | Define the size
measure vlen :: Vector a -> Int
-- | Compute the size
assume length :: x:Vector a -> {v:Int | v = vlen x}
-- | Lookup at an index
assume (!) :: x:Vector a -> {v:Nat | v < vlen x} -> a
```

Using this new specification is now a simple matter of telling LiquidHaskell to include this file:

`$ liquid -i include/ target.hs`

LiquidHaskell ships with specifications for `Prelude`

,
`Data.List`

, and `Data.Vector`

which it includes
by default.

**Measures** are used to define *properties* of
Haskell data values that are useful for specification and verification.
Think of `vlen`

as the *actual* size of a
`Vector`

regardless of how the size was computed.

**Assumes** are used to *specify* types describing
the semantics of functions that we cannot verify e.g. because we don’t
have the code for them. Here, we are assuming that the library function
`Data.Vector.length`

indeed computes the size of the input
vector. Furthermore, we are stipulating that the lookup function
`(!)`

requires an index that is between `0`

and
the real size of the input vector `x`

.

**Dependent Refinements** are used to describe
relationships *between* the elements of a specification. For
example, notice how the signature for `length`

names the
input with the binder `x`

that then appears in the output
type to constrain the output `Int`

. Similarly, the signature
for `(!)`

names the input vector `x`

so that the
index can be constrained to be valid for `x`

. Thus,
dependency lets us write properties that connect *multiple*
program values.

**Aliases** are extremely useful for defining
*abbreviations* for commonly occurring types. Just as we enjoy
abstractions when programming, we will find it handy to have
abstractions in the specification mechanism. To this end, LiquidHaskell
supports *type aliases*. For example, we can define
`Vector`

s of a given size `N`

as:

{-@ type VectorN a N = {v:Vector a | vlen v == N} @-}

and now use this to type `twoLangs`

above as:

{-@ twoLangs :: VectorN String 2 @-}
twoLangs = fromList ["haskell", "javascript"]

Similarly, we can define an alias for `Int`

values between
`Lo`

and `Hi`

:

{-@ type Btwn Lo Hi = {v:Int | Lo <= v && v < Hi} @-}

after which we can specify `(!)`

as:

`(!) :: x:Vector a -> Btwn 0 (vlen x) -> a`

Let’s try to write some functions to sanity check the specifications.
First, find the starting element – or `head`

of a
`Vector`

head :: Vector a -> a
head vec = vec ! 0

When we check the above, we get an error:

```
src/03-poly.lhs:127:23: Error: Liquid Type Mismatch
Inferred type
VV : Int | VV == ?a && VV == 0
not a subtype of Required type
VV : Int | VV >= 0 && VV < vlen vec
In Context
VV : Int | VV == ?a && VV == 0
vec : Vector a | 0 <= vlen vec
?a : Int | ?a == (0 : int)
```

LiquidHaskell is saying that `0`

is *not* a valid
index as it is not between `0`

and `vlen vec`

. Say
what? Well, what if `vec`

had *no* elements! A formal
verifier doesn’t make *off by one* errors.

**To Fix** the problem we can do one of two things.

*Require*that the input`vec`

be non-empty, or*Return*an output if`vec`

is non-empty, or

Here’s an implementation of the first approach, where we define and
use an alias `NEVector`

for non-empty
`Vector`

s

{-@ type NEVector a = {v:Vector a | 0 < vlen v} @-}
{-@ head' :: NEVector a -> a @-}
head' vec = vec ! 0

**Exercise: (Vector Head): **Replace the
`undefined`

with an *implementation* of
`head''`

which accepts *all* `Vector`

s but
returns a value only when the input `vec`

is not empty.

head'' :: Vector a -> Maybe a
head'' vec = undefined

**Exercise: (Unsafe Lookup): **The function
`unsafeLookup`

is a wrapper around the `(!)`

with
the arguments flipped. Modify the specification for
`unsafeLookup`

so that the *implementation* is
accepted by LiquidHaskell.

{-@ unsafeLookup :: Int -> Vector a -> a @-}
unsafeLookup index vec = vec ! index

**Exercise: (Safe Lookup): **Complete the implementation of
`safeLookup`

by filling in the implementation of
`ok`

so that it performs a bounds check before the
access.

{-@ safeLookup :: Vector a -> Int -> Maybe a @-}
safeLookup x i
| ok = Just (x ! i)
| otherwise = Nothing
where
ok = undefined

Ok, let’s write some code! Let’s start with a recursive function that
adds up the values of the elements of an `Int`

vector.

-- >>> vectorSum (fromList [1, -2, 3])
-- 2
vectorSum :: Vector Int -> Int
vectorSum vec = go 0 0
where
go acc i
| i < sz = go (acc + (vec ! i)) (i + 1)
| otherwise = acc
sz = length vec

**Exercise: (Guards): **What happens if you
*replace* the guard with `i <= sz`

?

**Exercise: (Absolute Sum): **Write a variant of the above
function that computes the `absoluteSum`

of the elements of
the vector.

-- >>> absoluteSum (fromList [1, -2, 3])
-- 6
{-@ absoluteSum :: Vector Int -> Nat @-}
absoluteSum = undefined

LiquidHaskell verifies `vectorSum`

– or, to be precise,
the safety of the vector accesses `vec ! i`

. The verification
works out because LiquidHaskell is able to *automatically infer*
^{1}

`go :: Int -> {v:Int | 0 <= v && v <= sz} -> Int`

which states that the second parameter `i`

is between
`0`

and the length of `vec`

(inclusive).
LiquidHaskell uses this and the test that `i < sz`

to
establish that `i`

is between `0`

and
`(vlen vec)`

to prove safety.

**Note** you need to run `liquid`

with the
option `--no-termination`

or make sure your source file has
`{-@ LIQUID "--no-termination" @-}, otherwise the code for`

go`
fails the now default termination check. We will come back to this
example later to see how to verify termination using metrics.

**Exercise: (Off by one?): **Why does the type of
`go`

have `v <= sz`

and not
`v < sz`

?

`loop`

Let’s refactor the above low-level recursive function into a generic
higher-order `loop`

.

loop :: Int -> Int -> a -> (Int -> a -> a) -> a
loop lo hi base f = go base lo
where
go acc i
| i < hi = go (f i acc) (i + 1)
| otherwise = acc

We can now use `loop`

to implement
`vectorSum`

:

vectorSum' :: Vector Int -> Int
vectorSum' vec = loop 0 n 0 body
where
body i acc = acc + (vec ! i)
n = length vec

**Inference** is a convenient option. LiquidHaskell
finds:

`loop :: lo:Nat -> hi:{Nat|lo <= hi} -> a -> (Btwn lo hi -> a -> a) -> a`

In English, the above type states that

`lo`

the loop*lower*bound is a non-negative integer`hi`

the loop*upper*bound is a greater then or equal to`lo`

,`f`

the loop*body*is only called with integers between`lo`

and`hi`

.

It can be tedious to have to keep typing things like the above. If
we wanted to make `loop`

a public or exported function, we
could use the inferred type to generate an explicit signature.

At the call `loop 0 n 0 body`

the parameters
`lo`

and `hi`

are instantiated with `0`

and `n`

respectively, which, by the way is where the
inference engine deduces non-negativity. Thus LiquidHaskell concludes
that `body`

is only called with values of `i`

that
are *between* `0`

and `(vlen vec)`

, which
verifies the safety of the call `vec ! i`

.

**Exercise: (Using Higher-Order Loops): **Complete the
implementation of `absoluteSum'`

below. When you are done,
what is the type that is inferred for `body`

?

-- >>> absoluteSum' (fromList [1, -2, 3])
-- 6
{-@ absoluteSum' :: Vector Int -> Nat @-}
absoluteSum' vec = loop 0 n 0 body
where
body i acc = undefined
n = length vec

`loop`

to compute `dotProduct`

s. Why does
LiquidHaskell flag an error? Fix the code or specification so that
LiquidHaskell accepts it.
-- >>> dotProduct (fromList [1,2,3]) (fromList [4,5,6])
-- 32
{-@ dotProduct :: x:Vector Int -> y:Vector Int -> Int @-}
dotProduct x y = loop 0 sz 0 body
where
body i acc = acc + (x ! i) * (y ! i)
sz = length x

While the standard `Vector`

is great for *dense*
arrays, often we have to manipulate *sparse* vectors where most
elements are just `0`

. We might represent such vectors as a
list of index-value tuples:

{-@ type SparseN a N = [(Btwn 0 N, a)] @-}

Implicitly, all indices *other* than those in the list have
the value `0`

(or the equivalent value for the type
`a`

).

**The Alias** `SparseN`

is just a shorthand for
the (longer) type on the right, it does not *define* a new type.
If you are familiar with the *index-style* length encoding
e.g. as found in DML or Agda, then
note that despite appearances, our `Sparse`

definition is
*not* indexed.

Let’s write a function to compute a sparse product

{-@ sparseProduct :: x:Vector _ -> SparseN _ (vlen x) -> _ @-}
sparseProduct x y = go 0 y
where
go n [] = n
go n ((i,v):y') = go (n + (x!i) * v) y'

LiquidHaskell verifies the above by using the specification to
conclude that for each tuple `(i, v)`

in the list
`y`

, the value of `i`

is within the bounds of the
vector `x`

, thereby proving `x ! i`

safe.

The sharp reader will have undoubtedly noticed that the sparse product can be more cleanly expressed as a fold:

`foldl' :: (a -> b -> a) -> a -> [b] -> a`

We can simply fold over the sparse vector, accumulating the
`sum`

as we go along

{-@ sparseProduct' :: x:Vector _ -> SparseN _ (vlen x) -> _ @-}
sparseProduct' x y = foldl' body 0 y
where
body sum (i, v) = sum + (x ! i) * v

LiquidHaskell digests this without difficulty. The main trick is in
how the polymorphism of `foldl'`

is instantiated.

GHC infers that at this site, the type variable

`b`

from the signature of`foldl'`

is instantiated to the Haskell type`(Int, a)`

.Correspondingly, LiquidHaskell infers that in fact

`b`

can be instantiated to the*refined*`(Btwn 0 (vlen x), a)`

.

Thus, the inference mechanism saves us a fair bit of typing and allows us to reuse existing polymorphic functions over containers and such without ceremony.

This chapter gave you an idea of how one can use refinements to verify size related properties, and more generally, to specify and verify properties of recursive and polymorphic functions. Next, let’s see how we can use LiquidHaskell to prevent the creation of illegal values by refining data type definitions.

In your editor, click on

`go`

to see the inferred type.↩︎