Functions as Values
We have functions, but they are second-class entities in our languages: they don’t have the same abilities as other values.
For example, the following is rejected:
with a multiple error messages:
Errors found!
tests/input/hof.diamond:(2:3)-(4:1): Function 'it' is not defined
2| it(5)
^^
tests/input/hof.diamond:7:3-7: Unbound variable 'incr'
7| f(incr)
^^^^
This is because the Env only holds
- parameters, and
- let-bound variables
and not function definitions.
But for the many reasons we saw in CSE 130 – we want to treat functions like values. For example, if you run the above in Python you get:
Flashback: How do we compile incr?
We compile each function down into a sequence of instructions corresponding to its body.
becomes:
label_def_incr_start:
push ebp # setup stack frame
mov ebp, esp
mov eax, [ebp + 8] # grab param
mov ebx, 2 # incr by 1
add eax, ebx
mov esp, ebp # undo stack frame
pop ebp
ret # buh-bye
our_code_starts_here:
push ebp
mov ebp, esp
push DWORD 10 # push arg '5'
call label_def_incr_start # call function
add esp, 4 # pop arg from stack
mov esp, ebp
pop ebp
retWhat is the value of a function?
So now, lets take a step back. Suppose we want to compile
Attempt 1: What is the value of the parameter it ?
IDEA: Use the label where incr lives!
label_def_f_start:
push ebp
mov ebp, esp
mov eax, [ebp + 8] # grab function-address
push DWORD 10 # push arg '5'
call eax # call function!
add esp, 4 # pop arg from stack
mov esp, ebp
pop ebp
retHow to pass the value of the parameter ?
So now the main expression
f(incr)
can be compiled to:
our_code_starts_here:
push ebp
mov ebp, esp
push ?1 # push arg
call ?2 # call function
add esp, 4 # pop arg
mov esp, ebp
pop ebp
retQUIZ: What are suitable terms for ?1 and ?2 ?
?1 |
?2 |
|
|---|---|---|
| A | label_def_incr_start |
labal_def_f_start |
| B | label_def_f_start |
labal_def_incr_start |
| C | label_def_f_start |
labal_def_f_start |
| D | label_def_incr_start |
labal_def_incr_start |
Strategy Progression
Representation =
Start-Label- Problem: How to do run-time checks of valid args?
Yay, that was easy! How should the following behave?
Lets cheat and see what Python does:
>>> def f(it): return it(5)
>>> def add(x,y): return x + y
>>> f(add)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 1, in f
TypeError: add() takes exactly 2 arguments (1 given)Problem: Ensure Valid Number of Arguments?
How to make sure
f(incr)succeeds, butf(add)fails
With proper run-time error?
- Where does the run-time check happen?
- What information is needed for the check?
Key: Need to also store the function’s arity
- The number of arguments required by the function
Strategy Progression
Representation =
Start-Label- Problem: How to do run-time checks of valid args?
Representation =
(Arity, Start-Label)
Attempt 2: What is the value of the parameter it ?
IDEA: Use a tuple of (arity, label)
We can now compile a call
e(x1,...,xn)
via the following strategy:
- Evaluate the tuple
e - Check that
e[0]is equal ton(else arity mismatch error) - Call the function at
e[1]
Example
Lets see how we would compile this
We need to map the variable
incr
to the tuple
(1, label_def_incr_start)
But where will we store this information?
Strategy Progression
Representation =
Start-Label- Problem: How to do run-time checks of valid args?
Representation =
(Arity, Start-Label)- Problem: How to map function names to tuples?
Lambda Terms Make functions just another expression!
Attempt 3: Lambda Terms
So far, we could only define functions at top-level
First-class functions are like any other expression,
Can define a function, wherever you have any other expression.
| Language | Syntax |
|---|---|
| Haskell | \(x1,...,xn) -> e |
| Ocaml | fun (x1,...,xn) -> e |
| JS | (x1,...,xn) => { return e } |
| C++ | [&](x1,...,xn){ return e } |
Example: Lambda Terms
We can now replace def as:
Implementation
As always, to the details! Lets figure out:
Representation
- How to store function-tuples
Types:
- Remove
Def - Add
lambdatoExpr
Transforms
- Update
tagandANF - Update
checker - Update
compile
Implementation: Representation
Represent lambda-tuples'' orfunction-tuples’’ via a special tag:
| Type | LSB |
|---|---|
number |
xx0 |
boolean |
111 |
pointer |
001 |
function |
101 |
In our code:
data Ty = ... | TClosure
typeTag :: Ty -> Arg
typeTag TTuple = HexConst 0x00000001
typeTag TClosure = HexConst 0x00000005
typeMask :: Ty -> Arg
typeMask TTuple = HexConst 0x00000007
typeMask TClosure = HexConst 0x00000007So, Function Values represented just like a tuples
- padding,
ESIetc. - but with tag
101.
Crucially, we can get 0-th, or 1-st elements from tuple.
Question: Why not use plain tuples?
Implementation: Types
First, lets look at the new Expr type
- No more
Def
data Expr a
= ...
| Lam [Bind a] !(Expr a) a -- fun. definitions
| App !(Expr a) [Expr a] a -- fun. callsSo we represent a function-definition as:
and a function call as:
Transforms: Tag
This is pretty straight forward (do it yourself)
Transforms: ANF
QUIZ:
Does e need to be
- A Immediate
- B ANF
- C None of the above
Transforms: ANF
QUIZ:
Do es need to be
- A Immediate
- B ANF
- C None of the above
Transforms: ANF
The App case, fun + args should be immediate
- Need the values to push on stack and make the call happen!
Just like function calls (in diamondback), except
- Must also handle the callee-expression (named
ebelow)
Transforms: ANF
QUIZ:
Does e need to be
- A Immediate
- B ANF
- C None of the above
Transforms: ANF
The Lam case, the body will be executed (when called)
- So we just need to make sure its in ANF (like all the code!)
Transforms: Checker
We just have Expr (no Def) so there is a single function:
wellFormed :: BareExpr -> [UserError]
wellFormed = go emptyEnv
where
gos = concatMap . go
go _ (Boolean {}) = ...
go _ (Number n l) = largeNumberErrors n l
go vEnv (Id x l) = unboundVarErrors vEnv x l
go vEnv (Prim1 _ e _) = ...
go vEnv (Prim2 _ e1 e2 _) = ...
go vEnv (If e1 e2 e3 _) = ...
go vEnv (Let x e1 e2 _) = ... ++ go vEnv e1 ++ go (addEnv x vEnv) e2
go vEnv (Tuple es _) = ...
go vEnv (GetItem e1 e2 _) = ...
go vEnv (App e es _) = ?1
go vEnv (Lam xs e _) = ?2 ++ go ?3 eHow shall we implement
?1?How shall we implement
?2?How shall we implement
?3?
Transforms: Compiler
Finally, lets see how to convert Expr into Asm, two separate cases:
Lam: definitionsApp: calls
Transforms: Compiler: Lam
compileEnv :: Env -> AExp -> [Instruction]
compileEnv env (Lam xs e l)
= IJmp end -- Why?
: ILabel start -- Function start
: compileDecl l xs e -- Function code (like Decl)
++ ILabel end -- Function end
: lamTuple arity start -- Compile fun-tuple into EAX
where
arity = length xs
start = LamStart l
end = LamEnd lQUESTION: Why do we start with the IJmp?
lamTuple :: Int -> Label -> [Instruction]
lamTuple arity start
= tupleAlloc 2 -- alloc tuple size = 2
++ tupleWrites [ repr arity -- fill arity
, CodePtr start ] -- fill code-ptr
++ [ IOr (Reg EAX) (typeTag TClosure) ] -- set the tag bitsTransforms: Compiler: App
Recall! IDEA: Use a tuple of (arity, label)
We can now compile a call
e(x1,...,xn)
via the following strategy:
- Evaluate the tuple
e - Check that
e[0]is equal ton(else arity mismatch error) - Call the function at
e[1]
compileEnv env (App vE vXs)
= assertType env vE TClosure -- check vE is a function
++ assertArity env vE (length vXs) -- check vE arity
++ tupleReadRaw (immArg env vE) (repr (1 :: Int)) -- load vE[1] into EAX
++ [IPush (param env vX) | vX <- reverse vXs] -- push args
++ [IPush (param env vE)] -- push closure-ptr
++ [ICall (Reg EAX)] -- call EAX
++ [IAdd (Reg ESP) (4 * (n + 1)] -- pop argsA Problem: Scope
Consider the following program:
A Problem Magnified: Dynamically Created Functions
Will it work? How about this variant:
let add = (lambda (n): (lambda (m): n + m))
, f = (lambda (it): it(5))
, plus1 = add(1)
, plus10 = add(10)
in
(f(plus1), f(plus10))add(1)should evaluate to a function-that-adds-1add(10)should evaluate to a function-that-adds-10
Yet, its the same code
- same arity
- same start-label
Problem: How can we represent different behaviors?
Free and Bound Variables
A variable x is bound inside an expression e if
xis a let-bound variable insidee.xis a formal parameter ine, OR
A variable x is free inside an expression e if
xis not bound insidee
For example consider the expression e :
m,tare bound insidee, but,nis free insidee
Computing Free Variables
Lets write a function to compute the set of free variables.
Question Why Set ?
freeVars :: Expr -> [Id]
freeVars e = S.toList (go e)
where
go :: Expr -> S.Set Id
go (Id x) = S.singleton x
go (Number _) = S.empty
go (Boolean _) = S.empty
go (If e e1 e2) = S.unions (map go [e1, e2, e3])
go (App e es) = S.unions (map go (e:es))
go (Let x e1 e2) = S.union (go e1) (S.delete x (go e2))
go (Lam xs e) = S.difference (go e) (S.fromList xs)=lambda (x1,x2,x3): e
let x = y + 10 in x + z
A. {x} B. {} C. { gobble_gobble }
TODO-IN-CLASS
Free Variables and Lambdas
Free Variables of a lambda
- Those whose values come from outside
- Should use the same values whenever we “call” the
lambda.
For example:
let add = (lambda (n): (lambda (m): n + m))
, f = (lambda (it): it(5))
, plus1 = add(1)
, plus10 = add(10)
in
(f(plus1), f(plus10), plus10(20))should evaluate to (6, 15, 30)
plus1be likelambda (m): 1 + mplus1be likelambda (m): 10 + m
Achieving Closure
(Recall from CSE 130)
Key Idea: Each function value must store its free variables
represent plus1 as:
(arity, code-label, [n := 1])
represent plus10 as:
(arity, code-label, [n := 10])
Same code, but different free variables.
Strategy Progression
Representation =
Start-Label- Problem: How to do run-time checks of valid args?
Representation =
(Arity, Start-Label)- Problem: How to map function names to tuples?
Lambda Terms Make functions just another expression!
- Problem: How to store local variables?
Function Value
(Arity, Start-Label, Free_1, ... , Free_N)- Ta Da!
Closures: Strategy
What if we have multiple free variables?
let foo = (lambda (x, y):
(lambda (z): x + y + z))
, plus10 = foo(4, 6)
, plus20 = foo(7, 13)
in
(plus10(0), plus20(100))represent plus10 as:
(arity, code-label, [x := 4], [y := 6])
represent plus20 as:
(arity, code-label, [x := 7], [y := 13])
Example
Lets see how to evaluate
TODO: PIC
Example
Lets see how to evaluate
let foo = (lambda (x, y):
(lambda (z): x + y + z))
, plus10 = foo(4, 6)
, f = (lambda (it): it(5))
in
f(plus10)TODO: PIC
Implementation
Representation
- How to store closures
Types:
- Same as before
Transforms
- Update
tagandANF- as before
Update
checkerUpdate
compile
Representation
We can represent a closure as a tuple of
(arity, code-ptr, free-var-1, ... free-var-N)
which means, following the convention for tuples, as:
-------------------------------------------------------------------------
| N + 2 | arity | code-ptr | var1 | var2 | ... | varN | (maybe padding) |
-------------------------------------------------------------------------
Where each cell represents 32-bits / 4-bytes / 1-word.
Note: (As with all tuples) the first word contains the #elements of the tuple.
- Which, in this case, it is
N + 2
Transforms: Checker
What environment should we use to check a Lam body ?
wellFormed :: BareExpr -> [UserError]
wellFormed = go emptyEnv
where
...
go vEnv (Lam xs e _) = errDupParams xs
++ go ?vEnv e
addsEnv :: Env -> [BareBind] -> Env
addsEnv env xs = foldr addEnv env xsQUIZ How shall we implement ?vEnv ?
A. addsEnv vEnv []
B. addsEnv vEnv xs
C. addsEnv emptyEnv xs
Transforms: Compile
Question How does the called function know the values of free vars?
Needs to restore them from closure tuple
Needs to access the closure tuple!
… But how shall we give the called function access to the tuple?
By passing the tuple as an extra parameter
Transforms: Compile
Calls App
- Push parameters
- Push closure-pointer-parameter
- Call code-label
- Pop params + pointer
Definitions Lam
- Compute free-vars
x1,…,xn - Generate code-block
- Restore free vars from closure-pointer-parameter
- Execute function body (as before)
- Allocate tuple
(arity, code-label, x1, ... , xn)
Transforms: Compile Calls
- Push parameters
- Push closure-pointer
- Call code-label
- Pop params + pointer
Transforms: Compile Definitions
- Compute free-vars
y1,…,yn - Generate code-block
- Restore free vars from closure-pointer-parameter
- Execute function body (as before)
- Allocate tuple
(arity, code-label, y1, ... , yn)
compileEnv :: Env -> AExp -> [Instruction]
compileEnv env (Lam xs e l)
= IJmp end -- Why?
: ILabel start -- Function start
: lambdaBody ys xs e -- Function code (like Decl)
++ ILabel end -- Function end
: lamTuple arity start env ys -- Compile closure-tuple into EAX
where
ys = freeVars (Lam xs e l)
arity = length xs
start = LamStart l
end = LamEnd lCreating Closure Tuples
To create the actual closure-tuple we need
- the free-variables
ys - the
envfrom which to values of the free variables.
lamTuple :: Int -> Label -> Env -> [Id] -> [Instruction]
lamTuple arity start env ys
= tupleAlloc (2 + length ys) -- alloc tuple 2 + |ys|
++ tupleWrites ( repr arity -- fill arity
: CodePtr start -- fill code-ptr
: [immArg env (Id y) | y <- ys] ) -- fill free-vars
++ [ IOr (Reg EAX) (typeTag TClosure) ] -- set the tag bitsGenerating Code Block
lambdaBody :: [Id] -> [Id] -> AExp -> [Instruction]
lambdaBody ys xs e = funInstrs maxStack
( restore ys -- restore free vars from closure-ptr
++ compileEnv env e ) -- exec function-body as before
where
maxStack = envMax env + countVars e -- max stack size
env = fromListEnv bs
bs = zip xs [-2,-3..] -- put params into env/stack
++ zip ys [1..] -- put free-vars into env/stackTo restore ys we use the closure-ptr passed in at [EBP+8] – the special first parameter – to copy the free-vars ys onto the stack.
restore :: [Id] -> [Instruction]
restore ys = concat [ copy i | (y, i) <- zip ys [1..]]
where
closPtr = RegOffset 8 EBP
copy i = tupleReadRaw closPtr (repr (i+1)) -- copy tuple-fld for y into EAX...
++ [ IMov (stackVar i) (Reg EAX) ] -- ...write EAX into stackVar for yA Problem: Recursion
Oops, how do we write:
If we try
We get a variable unbound error!
Errors found!
tests/input/fac-bad.fdl:5:20-23: Unbound variable 'fac'
5| n * fac(n-1))
We need to teach our compiler that its ok to use the name fac inside the body!
Solution: Named Functions
We have a new form of named functions
- Like Ocaml’s
let rec
Which looks like this:
Representing Named Functions
We extend Expr to handle such functions as:
data Expr a
= ...
| Fun (Bind a) -- ^ name of function
[Bind a] -- ^ list of parameters
(Expr a) a -- ^ body of functionNote that we parse the code
as the Expr
Compiling Named Functions
Mostly, this is left as an exercise to you.
Non-Recursive functions
- i.e.
fdoes not appear insideeinFun f xs e - Treat
Fun f xs easLam xs e… - … Everything should just work.
Recursive
- i.e.
fdoes appear insideeinFun f xs e - Can you think of a simple tweak to the
Lamstrategy that works?
Recap: Functions as Values
We have functions, but they are second-class entities in our languages: they don’t have the same abilities as other values.
Representation =
Start-Label- Problem: How to do run-time checks of valid args?
Representation =
(Arity, Start-Label)- Problem: How to map function names to tuples?
Lambda Terms Make functions just another expression!
- Problem: How to store local variables?
Function Value
(Arity, Start-Label, Free_1, ... , Free_N)- Ta Da!
Next: Adding static type inference
- Faster! Gets rid of those annoying (and slow!) run-time checks
- Safer! Catches problems at compile-time, when easiest to fix!