A three-stage program you definitely want to write

Writing programs explicitly in stages gives you guarantees that abstraction will be removed. A guarantee that the optimiser most certainly does not give you.

After spending the majority of my early 20s inside the optimiser, I decided enough was enough and it was time to gain back control over how my programs were partially evaluated.

So in this post I’ll give an example of how I took back control and eliminated two levels of abstraction for an interpreter by writing a program which runs in three stages.

Enter: An applicative interpreter for Hutton’s razor.

data Expr = Val Int | Add Expr Expr

eval :: Applicative m => Expr -> m Int
eval (Val n) = pure n
eval (Add e1 e2) = (+) <$> eval e1 <*> eval e2

Written simply at one level, there are two levels of abstraction which could be failed to be eliminated.

  1. If we statically know the expression we can eliminate Expr.
  2. If we statically know which Applicative then we can remove the indirection from the typeclass.

Using typed Template Haskell we’ll work out how to remove both of these layers.

Eliminating the Expression

First we’ll have a look at how to stage the program just to eliminate the expression without discussion the application fragment. This is a two-stage program.

module Two where

import Language.Haskell.TH

data Expr = Val Int | Add Expr Expr

eval :: Expr -> TExpQ Int
eval (Val n) = [|| n ||]
eval (Add e1 e2) = [|| $$(eval e1) + $$(eval e2) ||]

The eval function takes an expression and generates code which unrolls the expression that needs to be evaluated.

Splicing in eval gives us a chain of additions which are computed at run-time.

$$(eval (Add (Val 1) (Val 2)))
=> 1 + 2

By explicitly separating the program into stages we know that there will be no mention of Expr in the resulting program.

Eliminating the Applicative Functor

That’s good. Eliminating the Expr data type was easy. We’ll have to work a bit more to eliminate the applicative.

In the first stage, we will eliminate the expression in the same manner but instead of producing an Int, we will produce a SynApplicative which is a syntactic representation of an applicative. This allows us to inspect the structure of the program in the second stage and remove that overhead as well.

data SynApplicative a where
  Return :: WithCode a -> SynApplicative a
  App  :: SynApplicative (a -> b) -> SynApplicative a -> SynApplicative b

data WithCode a = WithCode { _val :: a, _code :: TExpQ a }

WithCode is a wrapper which pairs a value with a code fragment which was used to produce that value.

If you notice in the earlier example, this wasn’t necessary when it was known that we needed to persist an Int, as there is a Lift instance for Int. However, in general, not all values can be persisted so using WithCode is more general and flexible, if a bit more verbose.

elimExpr eliminates the first layer of abstraction and returns code which generates a SynApplicative.

elimExpr :: Expr -> TExpQ (SynApplicative Int)
elimExpr (Val n) = [|| Return (WithCode n (liftT n)) ||]
elimExpr (Add e1 e2) =
   [|| Return (WithCode (+) codePlus)
        `App` $$(elimExpr e1)
        `App` $$(elimExpr e2) ||]

liftT :: Lift a => a -> TExpQ a
liftT = unsafeTExpCoerce . lift

codePlus = [|| (+) ||]

In the case for Add we encounter a situation where we would have liked to use nested brackets to persist the value of [|| (+) ||]. Instead you have to lift it to the top level and then persist that identifier.

Next, it’s time to provide an interpreter to remove the abstraction of the applicative. In order to do this, we need to provide a dictionary which will be used to give the interpretation of the applicative commands.

data ApplicativeDict m =
  ApplicativeDict
    { _return :: (forall a . WithCode (a -> m a)),
      _ap     :: (forall a b . WithCode (m (a -> b) -> m a -> m b))
    }

WithCode is necessary again as it will be used to generate a program so it’s necessary to know how to implement the methods.

elimApplicative
  :: SynApplicative a
  -> ApplicativeDict m
  -> TExpQ (m a)
elimApplicative (Return v) d@ApplicativeDict{..}
  = [|| $$(_code _return) $$(_code v) ||]
elimApplicative (App e1 e2) d@ApplicativeDict{..}
  = [|| $$(_code _ap) $$(elimApplicative e1 d) $$(elimApplicative e2 d) ||]

This interpretation is very boring as it just amounts to replacing all the constructors with their implementations. However, it is exciting that we have guaranteed the removal of the overhead of the applicative abstraction.

Running the Splice

Now that we’ve written two functions independently to to eliminate the two layers, they need to be combined together. This is the birth of our three-stage program.

import Three

elim :: Identity Int
elim = $$(elimApplicative $$(elimExpr (Add (Val 1) (Val 2))) identityDict)

identityDict = ApplicativeDict{..}
  where
    _return = WithCode Identity [|| Identity ||]
    _ap = WithCode idAp [|| idAp ||]

idAp :: Identity (a -> b) -> Identity a -> Identity b
idAp (Identity f) (Identity a) = Identity (f a)

elim is the combination of elimApplicative and elimExpr. The nested splices indicate that the program is more than two levels.

Using -ddump-splices we can have a look at the program that gets generated.

Test.hs:10:30-59: Splicing expression
    elimExpr (Add (Val 1) (Val 2))
  ======>
    ((Return ((WithCode (+)) codePlus)
        `App` Return ((WithCode 1) (liftT 1)))
       `App` Return ((WithCode 2) (liftT 2)))
Test.hs:10:11-73: Splicing expression
    elimApplicative $$(elimExpr (Add (Val 1) (Val 2))) identityDict
  ======>
    (idAp ((idAp (Identity (+))) (Identity 1))) (Identity 2)

Both steps appear in the debug output with the code which was produced at each step. Notice that we had very precise control over what code was generated and that functions like idAp are not inlined. In this case, the compiler will certainly inline idAp and so on but in general it might be useful to generate code which contains calls to GHC.Exts.inline to force even recursive functions to be inlined once.

Conclusion

In general, splitting your program up into stages is quite difficult so mechanisms like type class specialisation will be easier to achieve. In controlled situations though, staging gives you the guarantees you need.