loop
Refinement types shine when we want to establish properties of polymorphic datatypes and higher-order functions. Rather than be abstract, let’s illustrate this with a classic use-case.
Array Bounds Verification aims to ensure that the
indices used to retrieve values from an array are indeed valid
for the array, i.e. are between 0
and the size of
the array. For example, suppose we create an array
with two
elements:
twoLangs = fromList ["haskell", "javascript"]
Lets attempt to look it up at various indices:
If we try to run the above, we get a nasty shock: an
exception that says we’re trying to look up twoLangs
at
index 3
whereas the size of twoLangs
is just
2
.
Prelude> :l 03-poly.lhs
[1 of 1] Compiling VectorBounds ( 03-poly.lhs, interpreted )
Ok, modules loaded: VectorBounds.
*VectorBounds> eeks
Loading package ... done.
"*** Exception: ./Data/Vector/Generic.hs:249 ((!)): index out of bounds (3,2)
In a suitable Editor e.g. Vim or Emacs, or if you push
the “play” button in the online demo, you will literally see the error
without running the code. Lets see how LiquidHaskell checks
ok
and yup
but flags nono
, and
along the way, learn how it reasons about recursion,
higher-order functions, data types and
polymorphism.
First, let’s see how to specify array bounds safety by
refining the types for the key
functions exported by Data.Vector
, i.e. how to
Vector
Vector
We can write specifications for imported modules – for which we
lack the code – either directly in the client’s source file or
better, in .spec
files which can be reused across multiple
client modules.
Include directories can be specified when checking a
file. Suppose we want to check some file target.hs
that
imports an external dependency Data.Vector
. We can write
specifications for Data.Vector
inside
include/Data/Vector.spec
which contains:
-- | Define the size
measure vlen :: Vector a -> Int
-- | Compute the size
assume length :: x:Vector a -> {v:Int | v = vlen x}
-- | Lookup at an index
assume (!) :: x:Vector a -> {v:Nat | v < vlen x} -> a
Using this new specification is now a simple matter of telling LiquidHaskell to include this file:
$ liquid -i include/ target.hs
LiquidHaskell ships with specifications for Prelude
,
Data.List
, and Data.Vector
which it includes
by default.
Measures are used to define properties of
Haskell data values that are useful for specification and verification.
Think of vlen
as the actual size of a
Vector
regardless of how the size was computed.
Assumes are used to specify types describing
the semantics of functions that we cannot verify e.g. because we don’t
have the code for them. Here, we are assuming that the library function
Data.Vector.length
indeed computes the size of the input
vector. Furthermore, we are stipulating that the lookup function
(!)
requires an index that is between 0
and
the real size of the input vector x
.
Dependent Refinements are used to describe
relationships between the elements of a specification. For
example, notice how the signature for length
names the
input with the binder x
that then appears in the output
type to constrain the output Int
. Similarly, the signature
for (!)
names the input vector x
so that the
index can be constrained to be valid for x
. Thus,
dependency lets us write properties that connect multiple
program values.
Aliases are extremely useful for defining
abbreviations for commonly occurring types. Just as we enjoy
abstractions when programming, we will find it handy to have
abstractions in the specification mechanism. To this end, LiquidHaskell
supports type aliases. For example, we can define
Vector
s of a given size N
as:
and now use this to type twoLangs
above as:
Similarly, we can define an alias for Int
values between
Lo
and Hi
:
after which we can specify (!)
as:
(!) :: x:Vector a -> Btwn 0 (vlen x) -> a
Let’s try to write some functions to sanity check the specifications.
First, find the starting element – or head
of a
Vector
When we check the above, we get an error:
src/03-poly.lhs:127:23: Error: Liquid Type Mismatch
Inferred type
VV : Int | VV == ?a && VV == 0
not a subtype of Required type
VV : Int | VV >= 0 && VV < vlen vec
In Context
VV : Int | VV == ?a && VV == 0
vec : Vector a | 0 <= vlen vec
?a : Int | ?a == (0 : int)
LiquidHaskell is saying that 0
is not a valid
index as it is not between 0
and vlen vec
. Say
what? Well, what if vec
had no elements! A formal
verifier doesn’t make off by one errors.
To Fix the problem we can do one of two things.
vec
be non-empty,
orvec
is non-empty, orHere’s an implementation of the first approach, where we define and
use an alias NEVector
for non-empty
Vector
s
Exercise: (Vector Head): Replace the
undefined
with an implementation of
head''
which accepts all Vector
s but
returns a value only when the input vec
is not empty.
Exercise: (Unsafe Lookup): The function
unsafeLookup
is a wrapper around the (!)
with
the arguments flipped. Modify the specification for
unsafeLookup
so that the implementation is
accepted by LiquidHaskell.
Exercise: (Safe Lookup): Complete the implementation of
safeLookup
by filling in the implementation of
ok
so that it performs a bounds check before the
access.
Ok, let’s write some code! Let’s start with a recursive function that
adds up the values of the elements of an Int
vector.
Exercise: (Guards): What happens if you
replace the guard with i <= sz
?
Exercise: (Absolute Sum): Write a variant of the above
function that computes the absoluteSum
of the elements of
the vector.
LiquidHaskell verifies vectorSum
– or, to be precise,
the safety of the vector accesses vec ! i
. The verification
works out because LiquidHaskell is able to automatically infer
1
go :: Int -> {v:Int | 0 <= v && v <= sz} -> Int
which states that the second parameter i
is between
0
and the length of vec
(inclusive).
LiquidHaskell uses this and the test that i < sz
to
establish that i
is between 0
and
(vlen vec)
to prove safety.
Note you need to run liquid
with the
option --no-termination
or make sure your source file has
{-@ LIQUID "--no-termination" @-}, otherwise the code for
go`
fails the now default termination check. We will come back to this
example later to see how to verify termination using metrics.
Exercise: (Off by one?): Why does the type of
go
have v <= sz
and not
v < sz
?
loop
Let’s refactor the above low-level recursive function into a generic
higher-order loop
.
We can now use loop
to implement
vectorSum
:
Inference is a convenient option. LiquidHaskell
finds:
loop :: lo:Nat -> hi:{Nat|lo <= hi} -> a -> (Btwn lo hi -> a -> a) -> a
In English, the above type states that
lo
the loop lower bound is a non-negative
integerhi
the loop upper bound is a greater then or
equal to lo
,f
the loop body is only called with integers
between lo
and hi
. It can be tedious to have to keep typing things like the above. If
we wanted to make loop
a public or exported function, we
could use the inferred type to generate an explicit signature.
At the call loop 0 n 0 body
the parameters
lo
and hi
are instantiated with 0
and n
respectively, which, by the way is where the
inference engine deduces non-negativity. Thus LiquidHaskell concludes
that body
is only called with values of i
that
are between 0
and (vlen vec)
, which
verifies the safety of the call vec ! i
.
Exercise: (Using Higher-Order Loops): Complete the
implementation of absoluteSum'
below. When you are done,
what is the type that is inferred for body
?
loop
to compute dotProduct
s. Why does
LiquidHaskell flag an error? Fix the code or specification so that
LiquidHaskell accepts it.
While the standard Vector
is great for dense
arrays, often we have to manipulate sparse vectors where most
elements are just 0
. We might represent such vectors as a
list of index-value tuples:
Implicitly, all indices other than those in the list have
the value 0
(or the equivalent value for the type
a
).
The Alias SparseN
is just a shorthand for
the (longer) type on the right, it does not define a new type.
If you are familiar with the index-style length encoding
e.g. as found in DML or Agda, then
note that despite appearances, our Sparse
definition is
not indexed.
Let’s write a function to compute a sparse product
LiquidHaskell verifies the above by using the specification to
conclude that for each tuple (i, v)
in the list
y
, the value of i
is within the bounds of the
vector x
, thereby proving x ! i
safe.
The sharp reader will have undoubtedly noticed that the sparse product can be more cleanly expressed as a fold:
foldl' :: (a -> b -> a) -> a -> [b] -> a
We can simply fold over the sparse vector, accumulating the
sum
as we go along
LiquidHaskell digests this without difficulty. The main trick is in
how the polymorphism of foldl'
is instantiated.
GHC infers that at this site, the type variable b
from the signature of foldl'
is instantiated to the Haskell
type (Int, a)
.
Correspondingly, LiquidHaskell infers that in fact b
can be instantiated to the refined
(Btwn 0 (vlen x), a)
.
Thus, the inference mechanism saves us a fair bit of typing and allows us to reuse existing polymorphic functions over containers and such without ceremony.
This chapter gave you an idea of how one can use refinements to verify size related properties, and more generally, to specify and verify properties of recursive and polymorphic functions. Next, let’s see how we can use LiquidHaskell to prevent the creation of illegal values by refining data type definitions.
In your editor, click on go
to see the
inferred type.↩︎