Defunctionalizing Arithmetic to an Abstract Machine

There is great value in meditating upon the simplest possible example of a thing you can come up with. In this case one can apply defunctionalization techniques that are often applied to lambda calculus to simpler arithmetic calculations.

Functional programming is cool and useful, but it isn’t clear how to implement the features they provide on hardware that is controlled by assembly code. Achieving this is a fairly large topic. One step on the way is the concept of an abstract machine.

Abstract machines make more explicit how to evaluate a program by defining a step relationship taking a state of the machine to another state. I think this may be closer to how hardware is built because hardware is physical system. Physical systems are often characterizable by their space of states and the transitions or time evolution of them. That’s Newtonian mechanics in a nutshell.

There is a methodology by which to connect the definitions of abstract machines to interpreters of lambda calculus.

  • Convert to continuation passing style to make the evaluation order explicit
  • Defunctionalize these continuations

However, the lambda calculus is a non trivial beast and really only a member of a spectrum of different programming language features. Here is an incomplete set of features that you can mix and match:

  • Arithmetic expressions
  • Boolean expressions
  • let bindings
  • Printing/Output
  • Reading/Input
  • Mutation, References
  • For/While loops
  • Named Global Procedures
  • Recursion
  • Lambda terms / Higher Order Functions
  • Call/CC
  • error throw try catch
  • Algebraic Data Types
  • Pattern matching

In my opinion, the simplest of any of these is arithmetic expressions and with only this you can already meaningfully explore this evaluator to abstract machine translation.

First we need a data type for arithmetic

data AExpr = Lit Int | Add AExpr AExpr deriving (Eq, Show)

Pretty basic. We could easily add multiplication and other operators and it doesn’t change much conceptually except make things larger. Then we can define a simple interpreter.

type Value = Int

eval :: AExpr -> Value
eval (Add x y) = (eval x) + (eval y)
eval (Lit i) = i

The first step of our transformation is to put everything in continuation passing style (cps). The way this is done is to add an extra parameter k to every function call. When we want to return a result from a function, we now call k with that instead. You can kind of think of it as a goofy return statement. eval' is equivalent to eval above.

evalk :: AExpr -> (Value -> Value) -> Value
evalk (Add x y) k = evalk x (\vx -> (evalk y $ \vy -> k (vx + vy)))
evalk (Lit i) k = k i

eval' :: AExpr -> Value
eval' e = evalk e id

Now we defunctionalize this continuation. We note that higher order continuation parameters take only a finite number of possible shapes if evalk is only accessed via the above code. k can either be id, (\vx -> (evalk y $ \vy -> k (vx + vy))) , or \vy -> k (vx + vy). We give each of these code shapes a constructor in a data type. The constructor needs to hold any values closed over (free variables in the expression). id needs to remember nothing, \vx -> (evalk y $ \vy -> k (vx + vy)) needs to remember y and k, and \vy -> k (vx + vy) needs to remember vx and k.

data AHole = IdDone | AddL AExpr AHole | AddR Value AHole 

What functions are is a thing that can be applied to it’s arguments. We can use AHole exactly as before by defining an apply function.

apply :: AHole -> Value -> Value 
apply IdDone      v = v
apply (AddL e k)  v = evald e (AddR v k)
apply (AddR v' k) v = apply k (v' + v) 

And using this we can convert evalk into a new form by replacing the continuations with their defunctionalized data type.

evald :: AExpr -> AHole -> Value
evald (Add x y) k = evald x (AddL y k)
evald (Lit i) k = apply k i

eval'' e = evald e IdDone

We can make this into more of a machine by inlining apply into evald and breaking up the tail recursion into individual steps. Now we have a step relation on a state consisting of continuation data AHole and program information AExpr. Every step makes progress towards evaluating the expression. If you squint a little, this machine is basically an RPN machine for evaluating arithmetic.

data Machine = Machine {  prog :: AExpr  , kont :: AHole}

step :: Machine -> Either Value Machine
step (Machine (Add x y) k) = Right $ Machine x (AddL y k)
step (Machine (Lit i) (AddL e k)) = Right $ Machine e (AddR i k)
step (Machine (Lit i) (AddR v k)) = Right $ Machine (Lit (i + v)) k
step (Machine (Lit i) (IdDone)) = Left i

init_machine e = Machine e IdDone

loop :: (a -> Either b a) -> a -> b
loop act x = case act x of
    Right x -> loop act x
    Left v -> v

eval'''' e = loop step (init_machine e)

Pretty neat right?

Artwork Courtesy of David

Now the next simplest steps in my opinion would be to add Booleans, Let expressions, and Print statements. Then after grokking that, I would attempt the CEK and Krivine Machines for lambda calculus.

Artwork Courtesy of David


Defunctionalizing arithmetic can be found in – Defunctionalization at Work – Danvy and Nielson Definitional Interpreters for Higher Order Programming Languages – Reynolds 1972. The grand daddy paper of defunctionalization – A Journey from Interpreters to Compilers and Virtual Machines – Mads Sig Ager, Dariusz Biernacki, Olivier Danvy,
Jan Midtgaard Best Refactoring You’ve never Heard of by Jimmy Koppel.

Xavier Leroy abstract machine slides – Leroy’s description of the Zinc Machine

CEK machine – Matt Might Neel Krishnaswami’s hypothetical compiler course.