CSV Tables
Most demonstrations for verification techniques involve programs with complicated
invariants and properties. However, these methods can often be rather useful for
describing simple but important aspects of APIs or programs with more humble
goals. I saw a rather nice example of using Scala's
Shapeless
library for preventing off-by-one errors in CSV processing
code. Here's the same, short, example rephrased with LiquidHaskell.
23: module CSV where 24: 25: -- | Using LiquidHaskell for CSV lists 26: -- c.f. http://www.reddit.com/r/scala/comments/1nhzi2/using_shapelesss_sized_type_to_eliminate_real/
The Type¶
Suppose you wanted to represent tables as a list of comma-separated values.
For example, here's a table listing the articles and prices at the coffee shop I'm sitting in right now:
Item | Price |
---|---|
Espresso | 2.25 |
Macchiato | 2.75 |
Cappucino | 3.35 |
Americano | 2.25 |
You might represent this with a simple Haskell data type:
64: 65: data CSV = Csv { (CSV.CSV) -> [[(GHC.Types.Char)]]headers :: [String] 66: , (CSV.CSV) -> [[[(GHC.Types.Char)]]]rows :: [[String]] 67: }
and now, the above table is just:
73: (CSV.CSV)zumbarMenu = x1:[[(GHC.Types.Char)]] -> [{VV : [[(GHC.Types.Char)]] | ((len VV) == (len x1))}] -> (CSV.CSV)Csv {VV : [{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}]<\_ VV -> ((len VV) >= 0)> | (((null VV)) <=> false) && ((len VV) >= 0)}[ {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Item" , {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Price"] 74: [ {VV : [{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}]<\_ VV -> ((len VV) >= 0)> | (((null VV)) <=> false) && ((len VV) >= 0)}[{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Espresso" , {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"2.25" ] 75: , {VV : [{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}]<\_ VV -> ((len VV) >= 0)> | (((null VV)) <=> false) && ((len VV) >= 0)}[{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Macchiato", {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"2.75" ] 76: , {VV : [{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}]<\_ VV -> ((len VV) >= 0)> | (((null VV)) <=> false) && ((len VV) >= 0)}[{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Cappucino", {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"3.35" ] 77: , {VV : [{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}]<\_ VV -> ((len VV) >= 0)> | (((null VV)) <=> false) && ((len VV) >= 0)}[{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Americano", {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"2.25" ] 78: ]
The Errors¶
Our CSV
type supports tables with an arbitrary number of headers
and
rows
but of course, we'd like to ensure that each row
has data for each
header, that is, we don't end up with tables like this one
89: -- Eeks, we missed the header name! 90: 91: (CSV.CSV)csvBad1 = x1:[[(GHC.Types.Char)]] -> [{VV : [[(GHC.Types.Char)]] | ((len VV) == (len x1))}] -> (CSV.CSV)Csv {VV : [{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}]<\_ VV -> false> | (((null VV)) <=> false) && ((len VV) >= 0)}[ {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Date" {- ??? -} ] 92: [ {VV : [{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}]<\_ VV -> ((len VV) > 0)> | (((null VV)) <=> false) && ((len VV) >= 0)}[{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Mon", {VV : [(GHC.Types.Char)]<\_ VV -> false> | (((null VV)) <=> false) && ((len VV) >= 0)}"1"] 93: , {VV : [{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}]<\_ VV -> ((len VV) > 0)> | (((null VV)) <=> false) && ((len VV) >= 0)}[{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Tue", {VV : [(GHC.Types.Char)]<\_ VV -> false> | (((null VV)) <=> false) && ((len VV) >= 0)}"2"] 94: , {VV : [{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}]<\_ VV -> ((len VV) > 0)> | (((null VV)) <=> false) && ((len VV) >= 0)}[{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Wed", {VV : [(GHC.Types.Char)]<\_ VV -> false> | (((null VV)) <=> false) && ((len VV) >= 0)}"3"] 95: ] 96:
or this one,
102: -- Blergh! we missed a column. 103: 104: (CSV.CSV)csvBad2 = x1:[[(GHC.Types.Char)]] -> [{VV : [[(GHC.Types.Char)]] | ((len VV) == (len x1))}] -> (CSV.CSV)Csv {VV : [{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}]<\_ VV -> ((len VV) >= 0)> | (((null VV)) <=> false) && ((len VV) >= 0)}[ {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Name" , {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Age" ] 105: [ {VV : [{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}]<\_ VV -> ((len VV) >= 0)> | (((null VV)) <=> false) && ((len VV) >= 0)}[{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Alice", {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"32" ] 106: , {VV : [{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}]<\_ VV -> false> | (((null VV)) <=> false) && ((len VV) >= 0)}[{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Bob" {- ??? -}] 107: , {VV : [{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}]<\_ VV -> ((len VV) >= 0)> | (((null VV)) <=> false) && ((len VV) >= 0)}[{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Cris" , {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"29" ] 108: ]
Alas, both the above are valid inhabitants of the Haskell CSV
type, and
so, sneak past GHC.
The Invariant¶
Thus, we want to refine the CSV
type to specify that the number of
elements in each row is exactly the same as the number of headers.
To do so, we merely write a refined data type definition:
123: {-@ data CSV = Csv { headers :: [String] 124: , rows :: [{v:[String] | (len v) = (len headers)}] 125: } 126: @-}
Here len
is a measure denoting the length of a list.
Thus, (len headers)
is the number of headers in the table, and the
refinement on the rows
field states that each row
is a list of String
s,
with exactly the same number of elements as the number of headers
.
We can now have our arbitrary-arity tables, but LiquidHaskell will make sure that we don't miss entries here or there.
138: -- All is well! 139: 140: (CSV.CSV)csvGood = x1:[[(GHC.Types.Char)]] -> [{VV : [[(GHC.Types.Char)]] | ((len VV) == (len x1))}] -> (CSV.CSV)Csv {VV : [{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}]<\_ VV -> ((len VV) >= 0)> | (((null VV)) <=> false) && ((len VV) >= 0)}[{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Id", {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Name", {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Days"] 141: [ {VV : [{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}]<\_ VV -> ((len VV) >= 0)> | (((null VV)) <=> false) && ((len VV) >= 0)}[{VV : [(GHC.Types.Char)]<\_ VV -> false> | (((null VV)) <=> false) && ((len VV) >= 0)}"1", {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Jan", {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"31"] 142: , {VV : [{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}]<\_ VV -> ((len VV) >= 0)> | (((null VV)) <=> false) && ((len VV) >= 0)}[{VV : [(GHC.Types.Char)]<\_ VV -> false> | (((null VV)) <=> false) && ((len VV) >= 0)}"2", {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Feb", {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"28"] 143: , {VV : [{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}]<\_ VV -> ((len VV) >= 0)> | (((null VV)) <=> false) && ((len VV) >= 0)}[{VV : [(GHC.Types.Char)]<\_ VV -> false> | (((null VV)) <=> false) && ((len VV) >= 0)}"3", {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Mar", {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"31"] 144: , {VV : [{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}]<\_ VV -> ((len VV) >= 0)> | (((null VV)) <=> false) && ((len VV) >= 0)}[{VV : [(GHC.Types.Char)]<\_ VV -> false> | (((null VV)) <=> false) && ((len VV) >= 0)}"4", {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Apr", {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"30"] 145: ]
Bonus Points¶
How would you modify the specification to prevent table with degenerate entries like this one?
155: (CSV.CSV)deg = x1:[[(GHC.Types.Char)]] -> [{VV : [[(GHC.Types.Char)]] | ((len VV) == (len x1))}] -> (CSV.CSV)Csv {VV : [{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}]<\_ VV -> ((len VV) >= 0)> | (((null VV)) <=> false) && ((len VV) >= 0)}[ {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Id", {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Name", {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Days"] 156: [ {VV : [{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}]<\_ VV -> ((len VV) >= 0)> | (((null VV)) <=> false) && ((len VV) >= 0)}[{VV : [(GHC.Types.Char)]<\_ VV -> false> | (((null VV)) <=> false) && ((len VV) >= 0)}"1" , {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Jan" , {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"31"] 157: , {VV : [{VV : [(GHC.Types.Char)] | ((len VV) >= 0)}]<\_ VV -> ((len VV) >= 0)> | (((null VV)) <=> false) && ((len VV) >= 0)}[{VV : [(GHC.Types.Char)]<\_ VV -> false> | (((null VV)) <=> false) && ((len VV) >= 0)}"2" , {VV : [(GHC.Types.Char)] | ((len VV) >= 0)}"Feb" , {VV : [{VV : (GHC.Types.Char) | false}]<\_ VV -> false> | (((null VV)) <=> true) && ((len VV) == 0) && ((len VV) >= 0)}""] 158: ]