I have about 75% of the standard R5RS functions implemented now. I chose to write the runtime library in C#, primarily because I wanted to make heavy use of the existing Framework and haven't shored up my Scheme<-{OtherIL} interop just yet. It's amazing how simple the mapping was. Most Scheme functions ended up being one- or two-liners that just marshalled through to existing APIs. Man, I love the Framework.
When I say {OtherIL}, I mean simply accessing C#-compiled code from Scheme, for example. It's trickier than it sounds. First, I need to figure out how to do lookups and bind to exiting .NET functions. I.e. do you simply access by the class's FQN and then search all referenced assemblies at compile-time for a match? Require assembly annotations? Introduce non-Scheme-compliant syntax? And so on. For the Scheme library functions, I have a controlled manner of adding them to the environment and accessing them.
Moreover, I can (and do!) make a lot of assumptions about consuming code I've compiled since... well... I compiled it. For example, all of the standard Scheme functions leave something on the stack after executing. (I.e. non-void return types.) Scheme differentiates between commands and expressions, the latter of which always return a value. Once I introduce methods with void return types, a lot of things will stop working, including a small feature called verifiability.
Anyhow, I've chosen to use the new Whidbey LinkedList<T> class for lists. This means that when you do a (cons x y), you're really emitting a new LinkedList in the background. (car x) will simply return the head of the list, while (cdr x) will return the portion of the list following the head. It's unfortunate, but I had to write a "sublist" function myself since the Framework doesn't have one. I chatted with Krzysztof about this, but it seems this is just the way the cookie crumbles! It was relatively simple anyhow, although not entirely trivial since a node maintains an affinity to a particular list (so new LinkedList<T>(otherList.Head.Next) won't work).
private static LinkedList<T> CloneSubList<T>(LinkedListNode<T> node)
{
LinkedList<T> newList = new LinkedList<T>();
while (node != null)
{
newList.AddTail(node.Value);
node = node.Next;
}
return newList;
}
So lists are surprisingly working almost perfectly with very little work! Even the 30-odd permutations of cadr, cddar, cdadadr, cdaaaddddaaddadddaaddr, and so on. :) I really wish there were better sublist support, though, as this is a horribly inefficient way to do business. Not to mention that I think the semantics will break down at some point, for example when using calls such as (set-cdr! (cdr x) y), as it mutates the copy of the list, not the original one. Still a little work to do here. Damn all of the x! methods in Scheme! They make things just a little bit trickier... should've stuck to a Haskell implementation.
Finally, after ratholing a bit on trying to static type check (solely through inferencing), I chatted with Jim a bit on Tuesday. Python is, along with Scheme, almost always purely dynamically typed, so I was curious what he did with IronPython. Seems like he does some interesting optimizations such as pooling boxed value types, but has a similar approach to mine. As I noted earlier, this means a hell of a lot of boxing and unboxing, along with casts galore to make the verifier happy. One snippet of ridiculous IL that gets created for an if statement is as follows.
This is the Scheme code:
(if #t
'success
'failure)
And this is the IL:
.method public hidebysig static void Main(string[] A_0) cil managed
{
.entrypoint
// Code size 43 (0x2b)
.maxstack 2
IL_0000: ldc.i4.1
IL_0001: box [mscorlib]System.Boolean
IL_0006: unbox [mscorlib]System.Boolean
IL_000b: ldind.i1
IL_000c: brfalse IL_001b
IL_0011: ldstr "success"
IL_0016: br IL_0020
IL_001b: ldstr "failure"
IL_0020: call string [SenchaRuntimeLibrary]Sencha.Runtime.RuntimeHelper::ToString(object)
IL_0025: call void [mscorlib]System.Console::WriteLine(string)
IL_002a: ret
} // end of method Program::Main
See any opportunity for optimization? Anyhow, this is a bit tangential, and is just a simple optimization I need to work on. The more worrysome fact is that I've created strongly typed overloads for all of the Scheme library functions, but can't use them right now! For example, I have:
[SchemeFunction("+")]
public static object op_Add(object z1, object[] zn);
[SchemeFunction("+")]
public static double op_Add(double z1, double[] zn);
Currently, I only ever bind to the first one. Which means stupid looking calls such as the following compilation.
Scheme:
(+ 1 2 3 4 5)
IL:
.method public hidebysig static void Main(string[] A_0) cil managed
{
.entrypoint
// Code size 111 (0x6f)
.maxstack 4
.locals init ([0] object[] V_0)
IL_0000: ldc.i4.4
IL_0001: newarr [mscorlib]System.Object
IL_0006: stloc V_0
IL_000a: nop
IL_000b: nop
IL_000c: ldc.r8 1.
IL_0015: box [mscorlib]System.Double
IL_001a: ldloc.0
IL_001b: ldc.i4.0
IL_001c: ldc.r8 2.
IL_0025: box [mscorlib]System.Double
IL_002a: stelem.ref
IL_002b: ldloc.0
IL_002c: ldc.i4.1
IL_002d: ldc.r8 3.
IL_0036: box [mscorlib]System.Double
IL_003b: stelem.ref
IL_003c: ldloc.0
IL_003d: ldc.i4.2
IL_003e: ldc.r8 4.
IL_0047: box [mscorlib]System.Double
IL_004c: stelem.ref
IL_004d: ldloc.0
IL_004e: ldc.i4.3
IL_004f: ldc.r8 5.
IL_0058: box [mscorlib]System.Double
IL_005d: stelem.ref
IL_005e: ldloc.0
IL_005f: call object [SenchaRuntimeLibrary]Sencha.Runtime.StandardSchemeFunctions::op_Add(object,
object[])
IL_0064: call string [SenchaRuntimeLibrary]Sencha.Runtime.RuntimeHelper::ToString(object)
IL_0069: call void [mscorlib]System.Console::WriteLine(string)
IL_006e: ret
} // end of method Program::Main
Gawd, that hurts the eyes! It's quite obvious I can inference in this case (as everything's a ldc.x), but it begins to break down quick enough.
His advice was to forego any attempts at type inferencing, or even the thought of adding static type annotations, until the purely dynamic version is working. I had started to come to the same conclusion after spending a lot of time on this. This is a tough problem to solve, even for languages that have static type annotations!
To finish the post off, here is a set of simple tests that now work correctly. I had on the order of 30 tests functioning now. Only about 150 to go. :)
(number? 35.05)
=> #t
((lambda (x y)
(x y))
(lambda (x)
((lambda (x) (* x y))
x))
3)
=> 9
(let ((gle (lambda (x y)
(cond ((> x y) 'greater)
((< x y) 'less)
(else 'equal)))))
(gle 3 10))
=> less
(car (cons 3 5))
=> 3
(cadr (cons 3 5))
=> 5