Source Transformation for Fun and Profit (by alaric)
Back when I was a kid, I designed a low-level macro system for purely functional Lispy languages, based around a cunning use of partial evaluation. I called it meta
.
Since I was a dreadfully lazy student, when I had to do a final year project, I suggested it as an idea and then pretended to think of and develop it all over again, before spending far too little time writing up a report on it, and a sample implementation. Which may have been why the sample implementation broke in the demonstration...
But at the time, I didn't think much of macro hygiene. All I'd seen of it was that Scheme at the time had a hygienic macro system called syntax-rules that, as far as I could tell, sucked - it was limited to basic-seeming rule-based substitutions, and did not use the full power of the Scheme language as a macro language.
However, things have changed since then. The Scheme community has come up with hygienic macro systems that let you write macros in Scheme, such as syntactic closures. And so I've found that hygiene is a desirable property, rather than a terrible design tradeoff.
So, I wonder to myself, can my meta
macro system be brought up to date and incorporate hygiene?
I wondered a bit if meta
could be ported to Scheme, but I have a hunch it would be a bit difficult to incorporate with Scheme's widespread support for mutation. Eager partial evaluation at compile time won't sit too well with mutable bindings, for the simple reason that it's easy to redefine the meaning of things like +
in Scheme - at run time.
So let's just consider this a design exercise for CHROME for now, rather than worrying about Scheme. In CHROME, I needn't worry about mutations, since it's purely functional. Mutations exist, but everything's still referentially transparent, thanks to uniqueness types.
For a start, I'll describe how meta
works.
Partial evaluation
A program in a purely functional language is, as the name suggests, a pure function. That is, it's an explicit list of inputs, which are processed in some way to produce an explicit list of outputs. The only other inputs to the program are references to global declarations of things, which cannot be updated. There are no other inputs to or outputs from the program. And the program can be broken down into subexpressions, most of which are function calls, which reference other pure functions - that have exactly the same property.
In other words, there's no way for a function to perform any I/O other than through its arguments, global constants, and return value. No printf
. No assigning to a global variable. Etc.
Now, imagine a function like this (using my proposed syntax, to give it a test drive):
[defun [fibonacci n]
[if: [< n 2]
then: 1
else: [+ [fibonacci [- n 1]] [fibonacci [- n 2]]]]]
[defun [fib-even n]
[if: [even? n]
then: [fibonacci n]
else: [fibonacci 15]]]
If you call fib-even
on an odd number, it always returns the fifteenth fibonacci number. If you call it on an even number, it returns that fibonacci number.
Now, a partially evaluating compiler, when presented with the definition of fib-even
, will see how much of it can be precomputed.
[even? n]
can be partially precomputed; the even?
symbol can be looked up in the environment, and the application replaced with [apply: ... to: (n)]
, with ...
replaced by the unprintable function object itself. Which, in itself, makes little difference. But since we don't have the defintion of even?
to hand and n
is not known in advance - it's a function argument - the application cannot be performed yet, so it can go no further. If the expression could be reduced to a constant - for example, if it was [even? 2]
, which would reduce to #t
- then the if
could be performed at compile time. But it can't so the best we can do is to try and reduce its arms as far as we can.
The then
arm can only be slightly reduced, too, by resolving the symbol reference to fibonacci
. n
is still unknown, but unlike even?
, we have the definition of fibonacci
to hand (in fact, it's what the symbol resolved to). We now have an application of a known function to an unknown value - which can be reduced further, by inlining the body of fibonacci
. We have to replace all occurrences of the argument name in fibonacci with the value the function is being applied to, but conveniently, in this case, it's just a reference to the symbol n
anyway.
However, the else
arm can be reduced. It's an application of a known function to a known value. The compiler can apply it there and then, and produce the fifteenth Fibonacci number - 92, I think. Which can then be left there as a literal constant.
That's partial evaluation. It's well known stuff, but it's the foundation of meta
.