Personal Info:
Joe leads the architecture of an experimental OS's developer platform, where
he is also chief architect of its programming language. His current mission is to enable
writing large-scale software that is reliable, secure, and scalable by-construction. Before this, Joe
founded the Parallel Extensions to .NET project.
He has been granted 19 patents, with 49 pending. When not working, Joe enjoys travelling with his wife,
writing books, writing music,
studying music theory & mathematics, and doing anything involving food & wine.
Disclaimer:
The contents of this site are my own personal opinions and do
not represent my employer's view in any way.
© 2012, Joe Duffy
 Sunday, February 06, 2005
I spent much of yesterday driving to Yakima Valley, hanging out for a bit, and then driving home. Round trip was about 7 hours and 380 miles, including non-driving time. I suspect it'd be a bit more fun during the spring or summer, as the place was (as expected) pretty quiet. In fact, it sounds like most wineries do barrel tastings around the end of April, so I think we're going to have to head back up... maybe a quick weekend trip.
I'm still a newbie to Washington, and so was pretty amazed at the scenery. You go from a rainy, tropical-ish climate, through Snoqualmie Pass (which reminded me so much of driving through New Hampshire on the east coast, with real snow falling even!), and then land in something which feels a bit like Arizona, with very few trees and very dry weather. All in a matter of 2 1/2 hours. This was one of those times I wish I had a bike.
Threw some photos up on MSN Spaces (free disk space :) ). Here are a few:




 Thursday, February 03, 2005
During our chat yesterday, a question came up about explicit layout structs.
Overlapping Fields
In particular, somebody was wondering why overlapping reference pointers are disallowed, yet overlapping value types are entirely legal. Consider what would happen if we did allow this: accessing a reference to an instance of type a through an overlapping reference field of incorrect type b would result in very bad behavior (or worse: a value type field b whose bytes would be interpreted as a reference to who the heck knows where).
I'm not precisely sure what the runtime would do in such a circumstance (die gracefully one would hope), but I suspect it's not so clearly defined -- hence the disallowance of this construct. :)
Overlapping structs are allowed since structs are just well defined sequences of bytes. One could argue this is a blemish on the CTS's type soundness (and I would agree wholeheartedly), but I'll leave that to other folks to debate.
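To make the legal case concrete, here is a minimal sketch of two overlapping value-type fields. The names (IntOrFloat, AsInt, AsFloat, OverlapDemo) are made up for illustration. Writing through one field and reading through the other simply reinterprets the same four bytes, which is exactly why this is harmless for plain values and dangerous for references.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical example type: both fields occupy the same four bytes.
[StructLayout(LayoutKind.Explicit)]
public struct IntOrFloat
{
    [FieldOffset(0)] public int AsInt;
    [FieldOffset(0)] public float AsFloat;
}

public static class OverlapDemo
{
    public static void Main()
    {
        IntOrFloat v = new IntOrFloat();
        v.AsFloat = 1.0f;
        // Reads the IEEE 754 bit pattern of 1.0f through the int field.
        Console.WriteLine("0x{0:X8}", v.AsInt); // prints 0x3F800000
    }
}
```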
Union<T,U>?
So anyhow, it got me wondering: could one throw together a general purpose union type using generics? I decided to try...
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Explicit)]
struct Union<T,U>
    where T : struct
    where U : struct
{
    // fields
    [FieldOffset(0)]
    private bool isT;
    [FieldOffset(1)]
    private T tValue;
    [FieldOffset(1)]
    private U uValue;

    // ctors
    public Union(T t)
    {
        uValue = default(U); //shutup compiler
        tValue = t;
        isT = true;
    }
    public Union(U u)
    {
        tValue = default(T); //shutup compiler
        uValue = u;
        isT = false;
    }

    // properties
    public bool IsT
    {
        get { return isT; }
    }
    public T TValue
    {
        get { return tValue; }
        set { tValue = value; isT = true; }
    }
    public U UValue
    {
        get { return uValue; }
        set { uValue = value; isT = false; }
    }
}
This enables you to do fancy unions with certain kinds of structs, using just one wasted byte at the beginning for the bool isT. It'd be nice if you could do reference types, too, but unless you can guarantee nobody will ever try to access, say, uValue when isT is true, it ain't gonna happen. Further, even with structs this wouldn't work if T or U had at least one reference type instance field for the same reasons outlined above -- you could imagine bad things happening if you were allowed to access a “corrupt” pointer, basically just some random bytes which make up a value type instance interpreted as a pointer to a memory location (ouch).
Looks great, right? Well, minus all of the aforementioned caveats. I marvelled at the beauty of this code. What a clever chap I am, I told myself. It even compiled!
Not So Good News
Unfortunately we don't allow execution of even this watered down version.
Why?
Because of the use of generics.
As I said, compilers permit it (at least C# does), but it won't pass our verifier and thus causes a TypeLoadException if you try to use the generated IL. In this case, it should be obvious that we could correctly lay out the struct in memory when the type is being JITted. However, the behavior here -- even if it passed verification -- simply isn't defined at all. Further, my guess is that there are some JIT optimizations (e.g. eager struct size computation that might not be generics-aware yet) that would be thrown off if we just permitted this to get through. Not that it isn't possible, just that we haven't spent the time to enable it... probably because of the relative obscurity (leave it to me to find the obscure things). :)
The behavior is entirely deterministic statically and thus at runtime because we only place the parameterized types at the end of the struct. Once the type is closed and we know the type arguments, we can easily compute the size (e.g. 1 + max(sizeof(T), sizeof(U))). If we had fields after the parameterized types, however, it'd be impossible to specify the right offsets statically and thus we'd run into problems. Although, we could still easily determine the right amount of storage at JIT time. One could imagine a [FieldOffset(max(sizeof(T),sizeof(U))+x)] construct that made this possible.
It's a shame. A general purpose Union<T,U> type would be pretty damn cool.
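For a fixed pair of value types the same layout can still be written by hand, without generics. What follows is only a sketch of that workaround: IntOrDouble and its members are hypothetical names, and I've put the payload at offset 8 rather than 1 so the double stays naturally aligned, which is my choice and not something the original code does.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical hand-specialized equivalent of Union<int,double>.
[StructLayout(LayoutKind.Explicit)]
public struct IntOrDouble
{
    [FieldOffset(0)] private bool isInt;
    // Both payload fields share the same starting offset.
    [FieldOffset(8)] private int intValue;
    [FieldOffset(8)] private double doubleValue;

    public IntOrDouble(int i)
    {
        doubleValue = 0.0; // satisfies definite assignment
        intValue = i;
        isInt = true;
    }

    public IntOrDouble(double d)
    {
        intValue = 0;      // satisfies definite assignment
        doubleValue = d;
        isInt = false;
    }

    public bool IsInt { get { return isInt; } }
    public int IntValue { get { return intValue; } }
    public double DoubleValue { get { return doubleValue; } }
}

public static class UnionDemo
{
    public static void Main()
    {
        IntOrDouble a = new IntOrDouble(42);
        IntOrDouble b = new IntOrDouble(2.5);
        Console.WriteLine(a.IsInt); // True
        Console.WriteLine(b.IsInt); // False
    }
}
```

The obvious cost is one such struct per pair of types, which is precisely the boilerplate a generic Union<T,U> would have removed.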
 Saturday, January 22, 2005
Having a massive library of techie books is a blessing in disguise... Especially when you're writing a book and need a constant reminder of how things really work. Or just a little inspiration.
I have these on my desk right now (listed in no particular order):
Notice that I have quite a few Java books. Interestingly, I own very few CLR/.NET Framework books... I've found that most of the Java material is transferrable. I'd recommend any one of the above books very highly.
So a while back I thought I'd be clever and post something entitled “Laptxp Pxrnx” (where x = o). It was completely innocent and simply depicted my three laptops strutting their stuff. They typically do not wear clothing and thus I had no compunction about posting photographs of them in their natural state.
Well, just recently I've begun to get a flurry of hits from Google (mostly from image search), and I'm confident that the massive amount of referral spam I get is at least somewhat related. Moreover, I am actually frightened by some of the search terms I see in the referrals.
Out of this I have distilled a moral: Avoid using naughty terms, even when you meant no harm. Google does not know intent, and dirty dirty people will stumble across your website. :)
(Btw: I am probably just improving the pxrnx post's page rank by linking to it above... ugh...)
Spent a little time hacking on forward interop between Sencha and the Framework, that is, enabling Scheme programs to call into existing libraries. For now, I settled on a new callp function (stands for “call platform”) which does some heavyweight reflection-based lookup and invocation. It takes a varying number of arguments, and if the best match is non-static, the first argument is used as the target of the invocation. The assemblies to search are defined at the command line, similar to C#'s /reference:x switch.
As an example of a few simple calls:
(callp "Console.WriteLine" "Hello, World!") -> #t
(callp "Math.Max" 32.5 59.2) -> 59.2
(let ((stream (callp "System.IO.StreamReader.ctor"
                     (callp "System.IO.File.OpenRead" "somefile.txt"))))
  (write (callp "System.IO.StreamReader.ReadToEnd" stream))
  (callp "System.IO.StreamReader.Close" stream)) -> #t
A bit verbose, yes, but I'm trying to avoid extending the language and instead extending the standard library. The “-> x” part is what the result of the expression evaluates to. Any methods with void returns translate into true, and a failure to locate a method translates into false. Admittedly, not optimal (for example with legitimate boolean return types where false is semantically meaningful), but it's at least got me cooking with gas for now.
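To give a feel for the kind of lookup described here, the following is a rough C# sketch and not Sencha's actual implementation. Platform, CallP and FindType are hypothetical names. It assumes type names resolve against the core library only (no command-line assembly list), leans on Type.InvokeMember's default binder for overload selection, and does not handle the ".ctor" form at all.

```csharp
using System;
using System.Reflection;

public static class Platform
{
    // Hypothetical stand-in for callp: static first, then instance on args[0].
    public static object CallP(string qualifiedName, params object[] args)
    {
        int dot = qualifiedName.LastIndexOf('.');
        if (dot < 0) return false;

        Type type = FindType(qualifiedName.Substring(0, dot));
        if (type == null) return false;
        string method = qualifiedName.Substring(dot + 1);

        const BindingFlags Flags = BindingFlags.Public | BindingFlags.InvokeMethod;
        try
        {
            // First try a static method that accepts all of the arguments.
            object result = type.InvokeMember(method, Flags | BindingFlags.Static, null, null, args);
            return result ?? true; // a void return comes back as null, mapped to true
        }
        catch (MissingMethodException)
        {
            if (args.Length == 0) return false;
        }

        // Otherwise treat the first argument as the target of an instance call.
        object[] rest = new object[args.Length - 1];
        Array.Copy(args, 1, rest, 0, rest.Length);
        try
        {
            object result = type.InvokeMember(method, Flags | BindingFlags.Instance, null, args[0], rest);
            return result ?? true;
        }
        catch (MissingMethodException)
        {
            return false; // a failed lookup maps to false
        }
    }

    // Assumption: tries the name as given, then with a "System." prefix.
    private static Type FindType(string name)
    {
        return Type.GetType(name) ?? Type.GetType("System." + name);
    }

    public static void Main()
    {
        Console.WriteLine(CallP("Math.Max", 32.5, 59.2));     // the larger argument
        Console.WriteLine(CallP("String.ToUpper", "sencha")); // SENCHA
        Console.WriteLine(CallP("Math.NoSuchMethod", 1.0));   // False
    }
}
```

The sketch inherits the ambiguity mentioned above: a method that legitimately returns null or false is indistinguishable from a void return or a failed lookup.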
 Friday, January 21, 2005
In Whidbey, we have a few great changes to delegates, two of which are particularly cool for languages of all sorts.
First, we have unbound delegates. These enable you to new up a delegate without having to supply an object instance at creation time. You just provide the method handle as you would with a static method, for example, and bind it lazily to an instance at invocation time. Interestingly, trying to pass null as the object pointer in v1.1 would die with a NullReferenceException.
C++/CLI has language syntax to support unbound delegates, but C# unfortunately does not. This feature is great for functional language-like algorithms and was initially conceived of to support STL.NET. As an example, say you have a collection of homogeneous objects and want to apply an instance function against each object in the set. To do this generically today, you'd have to use the reflection APIs, admittedly a little less nice than the C++ syntax. Now with unbound delegates, the code which iterates over the set's contents and does the invoking just supplies the target pointer as it calls invoke.
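Even without language syntax, an unbound delegate can be created from C# through Delegate.CreateDelegate. A minimal sketch, assuming a made-up delegate type StringOp whose first parameter stands in for the target object:

```csharp
using System;
using System.Reflection;

public static class UnboundDemo
{
    // Hypothetical delegate type: the first parameter plays the role of "this".
    public delegate string StringOp(string target);

    public static StringOp MakeUpper()
    {
        MethodInfo toUpper = typeof(string).GetMethod("ToUpper", Type.EmptyTypes);
        // No instance is supplied here; it arrives at invocation time.
        return (StringOp)Delegate.CreateDelegate(typeof(StringOp), toUpper);
    }

    public static void Main()
    {
        StringOp upper = MakeUpper();
        foreach (string s in new string[] { "unbound", "delegates" })
            Console.WriteLine(upper(s)); // UNBOUND, then DELEGATES
    }
}
```

Reflection is only needed once, to get hold of the method; each call inside the loop is an ordinary delegate invocation with the target passed as the first argument.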
Another cool feature is relaxed delegates. These enable you to bind to functions using covariant return and contravariant parameter types, and are in fact supported by C#. You get this feature for free and don't even need to change anything to take advantage of it. As an example of its use, consider this class hierarchy:
class A {}
class B : A {}
class C : B {}
And this delegate:
delegate B f(B b);
In v1.1, the only valid method signature to which you could refer would have to have exact parameter and return types, e.g. as in
B g(B b);
Now in v2.0 you can bind to properly variant methods, too, e.g. as in
B h(A a);
C h(B b);
C h(A a);
Based on the type hierarchy defined above, C is covariant with respect to B, and thus can be substituted for the return type; conversely, A is contravariant with respect to B, and thus can be used as the parameter type. Any combination of this variance is allowed. The following is not valid, however, as we're going the opposite direction (i.e. contravariant return, covariant parameters):
A h(C c);
Out and ref parameters continue to be treated as invariant for delegates, as do generic type parameters.
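Put together as a compilable sketch (the delegate is renamed F and the method H to follow C# naming habits, and RelaxedDemo is a made-up wrapper class), binding the most relaxed of those signatures looks like this:

```csharp
using System;

public class A { }
public class B : A { }
public class C : B { }

public delegate B F(B b);

public static class RelaxedDemo
{
    // Contravariant parameter (A instead of B), covariant return (C instead of B).
    public static C H(A a) { return new C(); }

    public static void Main()
    {
        F f = new F(H); // accepted from C# 2.0 onward
        B result = f(new B());
        Console.WriteLine(result.GetType().Name); // C
    }
}
```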
Now just to get co- and contravariance built into the runtime's type system. :)
 Thursday, January 20, 2005
Sweet. Tail calls were a bit easier to implement than I had expected. Still not 100%, but getting close.
At first, this was pretty damn difficult since I treat the result of evaluating a lambda as a delegate. E.g. a binding always ends up treating a variable bound to a lambda as a typed delegate (which points to the function that was generated). Unfortunately, delegates can't be tail called in the CLR.
So my first change was to start using the raw method handles instead of delegates where possible. This was an optimization I had to make anyhow, and it's had benefits elsewhere in the compiler that I just got for free. Then I changed my letrec implementation so that it directs the binding information down the lambda's AST before emitting its code. This way the lambda knows what it's being bound to (if anything), and adds it to a pseudo environment when generating its body.
I used to set the environment up for the lambda eagerly, but once I started referring recursively to the lambda being bound, it got to be quite difficult! I had originally thought I could emit dummy calls to a static void NoOp() function and do some backpatching afterwards, but it turns out Reflection.Emit doesn't allow you to change the IL you've already emitted. So I wound up with the syntax-directed design.
Anyhow, I need to write up some good whitepapers on this stuff. Bottom line, this Scheme program:
(letrec ((fact2 (lambda (n v)
                  (if (> n 0)
                      (fact2 (- n 1) (* v n))
                      v)))
         (fact (lambda (n) (fact2 n 1))))
  (fact 1000000))
Used to bomb out with a StackOverflowException. Now it runs beautifully... well, except for the fact that I don't have bignum support and hence the result is “Infinity”. Nonetheless, it was a straightforward exercise.
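For readers who think in C#, here is a rough illustration of why the tail call matters. It is a hand-written analogy, not what Sencha emits, and TailCallDemo, Fact2 and Fact2Loop are made-up names. Because the recursive call is the last thing fact2 does, reusing the current frame is equivalent to looping.

```csharp
using System;

public static class TailCallDemo
{
    // Direct transliteration of fact2: the recursive call is in tail position.
    // Without a real tail call, each step may consume another stack frame.
    public static double Fact2(double n, double v)
    {
        return n > 0 ? Fact2(n - 1, v * n) : v;
    }

    // What a tail call amounts to: update the arguments and jump back to the top.
    public static double Fact2Loop(double n, double v)
    {
        while (n > 0)
        {
            v = v * n;
            n = n - 1;
        }
        return v;
    }

    public static void Main()
    {
        Console.WriteLine(Fact2(5, 1)); // 120
        // Only the loop form is used for the large input, since the recursive
        // form is not guaranteed to run in constant stack space in C#.
        Console.WriteLine(double.IsPositiveInfinity(Fact2Loop(1000000, 1))); // True
    }
}
```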
I'm ashamed.
Well, the good news is, Sencha compiles simple programs like this:
(let ((sq (lambda (x) (* x x)))
      (dbl (lambda (x y) (x (x y)))))
  (dbl sq 10))
Which demonstrates nothing but good old environment modification and free variable binding.
But it also compiles this:
(letrec ((fib (lambda (x)
                (if (<= x 1)
                    1
                    (+ (fib (- x 1)) (fib (- x 2)))))))
  (fib 12))
Which demonstrates, of course, recursive let bindings. This isn't what I'm ashamed of.
What I am ashamed of is the nasty IL that gets generated from the latter.
First, the top-level execution:
.method public hidebysig static void Main(string[] A_0) cil managed
{
  .entrypoint
  // Code size 69 (0x45)
  .maxstack 3
  IL_0000: newobj instance void __lambda0::.ctor()
  IL_0005: dup
  IL_0006: ldvirtftn instance object __lambda0::Apply1(object)
  IL_000c: newobj instance void class [SenchaRuntimeLibrary]Sencha.Runtime.'Func1`2'<object,object>::.ctor(object, native int)
  IL_0011: stsfld class [SenchaRuntimeLibrary]Sencha.Runtime.'Func1`2'<object,object> Program::fib
  IL_0016: ldsfld class [SenchaRuntimeLibrary]Sencha.Runtime.'Func1`2'<object,object> Program::fib
  IL_001b: dup
  IL_001c: ldc.i4.1
  IL_001d: call void [SenchaRuntimeLibrary]Sencha.Runtime.RuntimeHelper::AssertCompatableFunctionType(object, int32)
  IL_0022: castclass class [SenchaRuntimeLibrary]Sencha.Runtime.'Func1`2'<object,object>
  IL_0027: ldc.r8 12.
  IL_0030: box [mscorlib]System.Double
  IL_0035: call instance !0 class [SenchaRuntimeLibrary]Sencha.Runtime.'Func1`2'<object,object>::Invoke(!1)
  IL_003a: call string [SenchaRuntimeLibrary]Sencha.Runtime.RuntimeHelper::ToString(object)
  IL_003f: call void [mscorlib]System.Console::WriteLine(string)
  IL_0044: ret
} // end of method Program::Main
Notice the obvious areas for optimization... For example, if I construct a new Func1`2 delegate right before I call it... well... do I really need all that crap about reloading and type checking? Likely not.
But it gets worse. Check out the actual fib function IL:
.method public hidebysig virtual instance object Apply1([in] object x) cil managed
{
  .override method instance !0 class [SenchaRuntimeLibrary]Sencha.Runtime.'Closure1`2'<object,object>::Apply1(!1)
  // Code size 189 (0xbd)
  .maxstack 10
  .locals init ([0] object[] V_0, [1] object[] V_1, [2] object[] V_2)
  IL_0000: ldarg.1
  IL_0001: ldc.r8 1.
  IL_000a: box [mscorlib]System.Double
  IL_000f: call bool [SenchaRuntimeLibrary]Sencha.Runtime.StandardSchemeFunctions::op_LessThanOrEqual(object, object)
  IL_0014: box [mscorlib]System.Boolean
  IL_0019: call bool [SenchaRuntimeLibrary]Sencha.Runtime.StandardSchemeFunctions::IsTrue(object)
  IL_001e: brfalse.s IL_0033
  IL_0020: ldc.r8 1.
  IL_0029: box [mscorlib]System.Double
  IL_002e: br IL_00bc
  IL_0033: ldc.i4.1
  IL_0034: newarr [mscorlib]System.Object
  IL_0039: stloc V_0
  IL_003d: nop
  IL_003e: nop
  IL_003f: ldsfld class [SenchaRuntimeLibrary]Sencha.Runtime.'Func1`2'<object,object> Program::fib
  IL_0044: dup
  IL_0045: ldc.i4.1
  IL_0046: call void [SenchaRuntimeLibrary]Sencha.Runtime.RuntimeHelper::AssertCompatableFunctionType(object, int32)
  IL_004b: castclass class [SenchaRuntimeLibrary]Sencha.Runtime.'Func1`2'<object,object>
  IL_0050: ldc.i4.1
  IL_0051: newarr [mscorlib]System.Object
  IL_0056: stloc V_1
  IL_005a: nop
  IL_005b: nop
  IL_005c: ldarg.1
  IL_005d: ldloc.1
  IL_005e: ldc.i4.0
  IL_005f: ldc.r8 1.
  IL_0068: box [mscorlib]System.Double
  IL_006d: stelem.ref
  IL_006e: ldloc.1
  IL_006f: call object [SenchaRuntimeLibrary]Sencha.Runtime.StandardSchemeFunctions::op_Sub(object, object[])
  IL_0074: call instance !0 class [SenchaRuntimeLibrary]Sencha.Runtime.'Func1`2'<object,object>::Invoke(!1)
  IL_0079: ldloc.0
  IL_007a: ldc.i4.0
  IL_007b: ldsfld class [SenchaRuntimeLibrary]Sencha.Runtime.'Func1`2'<object,object> Program::fib
  IL_0080: dup
  IL_0081: ldc.i4.1
  IL_0082: call void [SenchaRuntimeLibrary]Sencha.Runtime.RuntimeHelper::AssertCompatableFunctionType(object, int32)
  IL_0087: castclass class [SenchaRuntimeLibrary]Sencha.Runtime.'Func1`2'<object,object>
  IL_008c: ldc.i4.1
  IL_008d: newarr [mscorlib]System.Object
  IL_0092: stloc V_2
  IL_0096: nop
  IL_0097: nop
  IL_0098: ldarg.1
  IL_0099: ldloc.2
  IL_009a: ldc.i4.0
  IL_009b: ldc.r8 2.
  IL_00a4: box [mscorlib]System.Double
  IL_00a9: stelem.ref
  IL_00aa: ldloc.2
  IL_00ab: call object [SenchaRuntimeLibrary]Sencha.Runtime.StandardSchemeFunctions::op_Sub(object, object[])
  IL_00b0: call instance !0 class [SenchaRuntimeLibrary]Sencha.Runtime.'Func1`2'<object,object>::Invoke(!1)
  IL_00b5: stelem.ref
  IL_00b6: ldloc.0
  IL_00b7: call object [SenchaRuntimeLibrary]Sencha.Runtime.StandardSchemeFunctions::op_Add(object, object[])
  IL_00bc: ret
} // end of method __lambda0::Apply1
Ouch! You mean I have to generate verifiable code and optimize it?!? As each day passes, I respect commercial compiler teams a little more.
In case you're wondering, the roughly equivalent C# code would be:
using System;

class Program {
    delegate double fibfunc(double x);
    static fibfunc fib;

    static void Main() {
        fib = delegate (double x) {
            if (x <= 1)
                return 1;
            else
                return fib(x - 1) + fib(x - 2);
        };
        Console.WriteLine(fib(12));
    }
}
Believe me -- the IL generated is a bit nicer. :)
 Monday, January 17, 2005
Ouch. Them's are fightin' words. Break out the steer.