I was hacking ever so briefly on my compiler today, and found a pretty subtle bug. Basically, free variables weren't being detected correctly which whacked my closure conversion process--resulting in IL which didn't quite do what it should have. I'm not sure that anybody would find the debugging exercise interesting, but humor me... ok? ;)
I have this simple test case:
(define (mkadder n)
(+ x n)))
(define z0 (mkadder 5))
(define z1 (mkadder 10))
(write "Should be 55: ")
(write (z0 50))
(write "Should be 60: ")
(write (z1 50))
Nothing terribly groundbreaking. I have a small # of these simple tests (~50) which aren't really tests at all. I just compile them, stash away the IL, and review and accept (or debug) any changes to the IL when I make compiler changes. Ideally, I'd have some asserts in there, but this seems to work fine for now.
As you probably noticed above, the (lambda (x) (+ x n)) contains a free variable, n, which is bound by its enclosing mkadder lambda. I represent bound closures through a pretty popular closure conversion technique.
It's pretty simple: Any lambdas with no free variables are emitted as static methods; other lambdas, however, get emitted as classes containing fields to hold the bindings of their free variables. These classes also have a constructor which requires all such bindings to be supplied, and an Apply method that takes whatever # of args the lambda needs to execute (this is the meat of the lambda).
E.g. mkadder is a static method since it doesn't contain free variables:
public static Func1<double,double> mkadder(double n);
However, the anonymous lambda inside gets its own class:
public class _lambda0 : Closure<float>
private float n;
public _lambda0(float n);
public float Apply(float x);
(A smart cookie might notice that we have a subtle issue here--that n actually gets blitted around. So when mkadder passes n to _lambda0's constructor, if _lambda0 ends up mutating it, we're not sharing the same reference, and hence mkadder wouldn't see _lambda0's updates. This is a problem since this is the correct semantics. To solve this problem, I try to detect any mutation and lift the variable out into a shared static reference; by detect, I look for either a) mutation for sure (e.g. set!), or b) can't tell (e.g. passing it off to yet another method). But I digress...)
Before calling Apply, an instance of the closure class must be constructed, at which time the values for free variables are provided, fields set, and free variables bound. (Another popular technique is to pass free variables as additional arguments to the method. I considered this, but this way seems a bit nicer for interop scenarios from OO languages (i.e. C#). It does mean that I have to construct simple objects and throw them away almost immediately, though, so once I get to the perf tuning stage I might change my mind. I could always go with both, and make the instance method just a shortcut to the static which passes the field values as the free variable arguments. This sacrifices code size, though. Never perfect, is it?)
Anyhow, I detect free variables and pick the strategy by doing a depth first traversal of my AST. During this traversal, I accumulate variable references, and for special nodes, such as a lambda which binds parameters, or a define or let which creates bindings for a precise scope, I delete variables from the ongoing free variable list. This works nicely if your traversal code works properly. :) When it doesn't, bad things happen.
I represent things like the operands to a procedure call as an IList<AstNode>. Unfortunately, my traversal code uses reflection to get at an AST node's children, and looks for both nodes directly and IEnumerable types (which contain lists of nodes). But alas, IList<T> doesn't implement the old-style IEnumerable (although it does implement IEnumerable<T>). It makes sense--doing so would require anybody implementing a new generics-style list to implement the old-style one, too. I wanted to say typeof(IEnumerable<*>).IsAssignableFrom(foo), but of course you can't easily express such things in C#... (although through some hard-to-follow reflection code you most certainly can). I could have done IEnumerable<AstNode>, but unfortunately I have nodes which use further derived types as the type argument, so this wouldn't have worked... (although using co-/contra-variant support in IL might have worked... I prefer to remain in C#-world for now).
This meant the free variable n wasn't even being picked up! Which in turn meant that my lambda was fully closed and could be emitted as a static method. Ouch. Obviously, this caused my program to fail, but the scary thing is how subtly: it compiled and even passed verification. How I emitted valid IL when I should have been missing a variable reference is beyond me at the moment. This surely will uncover a couple bugs which happen in strange edge cases like this. Lesson learned #1: When in doubt, fail loudly. Lesson learned #2: Get better test coverage. This is such a simple case, I'm surprised I just found it now.
Anyhow... What did I do to fix this? Pretty simple: I just changed my AstNodes to use List<T>. The BCL team's done the work to implement both IEnumerable and IEnumerable<T> on List<T>. I know I'll always have objects in there, so I'm not worried about the boxing overhead this would have normally incurred (although I do end up doing a lot of runtime type checking--something I need to do anyhow).
And like magic, it works! Yaay.