 Wednesday, March 30, 2005
I mentioned it a while back, but I have some level of platform interop working with Sencha. By “platform interop,” I mean writing code that uses functions defined elsewhere (i.e. not Scheme built-ins and not custom written stuff). The gunk that enables this usually ends up making a best guess at binding, in some cases performing operations to bridge the type system gap that exists.
One of the interesting things I noticed along the way was how easy it is to work with XML inside Scheme. I do some marshalling back and forth using factories so that, when you're in Scheme, you see S-expressions which can be processed and transformed as ordinary lists of data. When you're using Framework APIs that expect XmlReaders, Documents, and the like, however, they see what they expect. It's admittedly dangerous to perform this style of conversion implicitly, but for the time being it's been the source of some fun experimentation.
For example, generating SOAP is quite simple:
'(Envelope (:ns s)
   (@xmlns:s "http://www.w3.org/2003/05/soap-envelope")
   (@xmlns:wsa "http://schemas.xmlsoap.org/ws/2004/08/addressing")
   (@xmlns:f123 "http://www.fabrikam123.example/svc53")
   (Header (:ns s)
     (MessageID (:ns wsa) "uuid:aaaabbbb-cccc-dddd-eeee-ffffffffffff")
     (ReplyTo (:ns wsa)
       (Address (:ns wsa) "http://business456.example/client1"))
     (To (@mustUnderstand "1") (:ns wsa) "mailto:joe@fabrikam123.example")
     (Action (:ns wsa) (@mustUnderstand "1") "http://fabrikam123.example/mail/Delete"))
   (Body (:ns s)
     (Delete (:ns f123)
       (maxCount 42))))
Similar to what's possible in C-omega, quasiquotations enable you to embed calculations in the message. For example, the body node could have been:
`(Body (:ns s) ,(generateSoapBody 42))
Which has the nice effect of substituting the return value of the generateSoapBody function, passing 42 as its argument.
As I said, the greatest thing about this is that you can use all of the list processing techniques that Lisp languages are good for, existing libraries, and so on, and then easily convert the result back into XML. Parsing, schema and namespace validation, and resource resolution (e.g. DTDs) are all done for you by the existing System.Xml Framework libraries.
I know that the linkage between the two technologies has been observed in the past, but it seems like there's a lot of room for innovation in the future. Update: just noticed this page. Some interesting stuff.
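To make the marshalling idea a bit more concrete, here is a rough C# sketch of going from an S-expression to XML. It is not Sencha's implementation. It assumes the S-expression arrives as nested object[] values, reuses the (:ns prefix) and (@name value) conventions from the SOAP example above, and leans on System.Xml.Linq purely for brevity. The SexpXml class and its ToXml method are names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical helper: converts a nested object[] "S-expression" into an XElement.
public static class SexpXml
{
    public static XElement ToXml(object[] sexp)
    {
        return ToXml(sexp, new Dictionary<string, XNamespace>());
    }

    private static XElement ToXml(object[] sexp, Dictionary<string, XNamespace> outer)
    {
        // Copy the scope so declarations on this element do not leak to siblings.
        var scope = new Dictionary<string, XNamespace>(outer);
        string prefix = null;

        // Pass 1: collect (@xmlns:p "uri") declarations and this element's (:ns p).
        for (int i = 1; i < sexp.Length; i++)
        {
            if (sexp[i] is object[] item && item.Length == 2 && item[0] is string head)
            {
                if (head == ":ns")
                    prefix = (string)item[1];
                else if (head.StartsWith("@xmlns:"))
                    scope[head.Substring(7)] = XNamespace.Get((string)item[1]);
            }
        }

        // An undeclared prefix throws KeyNotFoundException; good enough for a sketch.
        XNamespace ns = prefix == null ? XNamespace.None : scope[prefix];
        var element = new XElement(ns + (string)sexp[0]);

        // Pass 2: attributes, child elements, and text, in document order.
        for (int i = 1; i < sexp.Length; i++)
        {
            if (sexp[i] is object[] item && item.Length > 0 && item[0] is string head)
            {
                if (head == ":ns")
                    continue;
                if (head.StartsWith("@xmlns:"))
                    element.Add(new XAttribute(XNamespace.Xmlns + head.Substring(7), item[1]));
                else if (head.StartsWith("@"))
                    element.Add(new XAttribute(head.Substring(1), item[1]));
                else
                    element.Add(ToXml(item, scope));
            }
            else
            {
                element.Add(sexp[i]); // atoms become text content
            }
        }
        return element;
    }
}
```

Two passes are used because the example declares a prefix on the same element that uses it, so the declarations have to be known before the element's name can be resolved.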
 Saturday, March 26, 2005
I can't imagine what it would have been like to write a CLR/.NET Framework 1.0 book from the outside before it was released. That is, while not working for Microsoft.
I just upgraded from Whidbey Beta1 to PD7, and found a ton of interesting changes. Mostly API renames which, thankfully, are done through obsoleting the old one. This makes migrating much smoother since the build doesn't break, and I can just follow up on the warnings as needed. I have to be better about keeping up from drop to drop. Even without doing that, though, every so often I get wind of a change via email or word of mouth, and make a note to check it out at some point in the future. I basically have a whole set of Outlook folders, one for each chapter, with nearly one hundred todo's... One has to wonder if I'll ever be able to follow up on them all. :)
Luckily, I also have a whole suite of test cases which I am using to verify when something changes. Basically, anything I write about, I write a test case for. This has definitely been one of the best decisions I made towards the beginning of the project. This at least helps to identify areas where the text might now be out of date, and points directly to code samples which are obsolete.
Simon's "how to write a great research paper" material is simply great. I think I need to follow his advice, particularly about starting to write early and often.
I've countless ideas bouncing inside my head, most of which will simply get lost, but many of which I truly believe should see the light of day. I think I'm a bit of an over-thinker, and tend to wait until my ideas sufficiently mature before expressing them. It's weird, but I guess I've always been like that. The problem is, I have so much stuff that goes on in a typical day, so many ideas coming and going, that I'm not able to mature those ideas as I would like. Most of it just gathers dust and eventually disappears. I suspect a lot of people have this problem.
Ideas are cheap.
There's a significant difference between an idea and an implementation of that idea. This, too, is where I get caught up. Sure I could come up with some nice theoretical goop, but if there's no implementation, what the hell good is it? Somebody might read it someday, but likely not. But unfortunately, implementing an idea is much more costly than simply thinking it. And I tend to get bored pretty quickly anyhow, instead preempting progress in order to gravitate towards better ideas that I perceive to be more worth my time.
It seems to me that the most successful people in our society are those who can follow through and execute on mediocre-to-good ideas. There's much more value in being able to implement an idea (be it your own or somebody else's) than actually thinking it in the first place.
Open source seems to be pretty helpful in solving this problem. Similar to how a senior researcher will often be given a group of folks to implement his/her ideas, if you can start an open source project and convince a bunch of people it's worthwhile, you can achieve much more in a shorter period of time than doing it by yourself. Of course, commercial environments are much like this, too, but it's often more difficult to sell ideas in a corporate environment. In a smaller business, you're likely to have a better chance. Starting a start-up to implement your idea is another great option, one that's usually significantly more costly than any of the others above.
I mention all of this without discussing goals. What is the goal of implementing an idea anyhow? For some, I suppose it's money. A good implementation of a good idea probably will get some level of recognition, and with clever distribution models for that idea, you could make some cash. But for a lot of people, I think it's laying the foundation for future ideas. As Simon puts it, infecting your audience with your idea. So when I mention “benefit,” it's highly subjective, depending on whether you're a greedy bastard or a naive ideologue (of course there are other choices, too ;)).
How does one evaluate the cost/benefit of following through on an idea? I wonder if there are any decent models out there that would help. I suppose choosing between ideas is a bit like branching between possible nodes in a graph in elementary utility theory. Simply calculate the probability of a given outcome, cost of traversing the node, and perceived benefit, and simply seek to minimize cost and maximize benefit. Seems so simple. The inherent lack of structure in human thought troubles me. The fact that I put up with it even more so.
Random.
Back to writing my book.
 Friday, March 25, 2005
From Paul Graham's recent post:
I think you should say "College is where faking starts to stop working."
This statement is up for interpretation. But I at least think I understand the gist.
I've always been fascinated with academia at least partially for this reason, and I've noticed a progressing desire as I age to be in environments where “faking it” is less and less tolerable. Microsoft is certainly more of a meritocracy, most academic organizations even more so. (Although, from what I hear, tenure tends to spoil this effect in many cases.) I spend countless hours of recreational time in order to improve my own thinking, mostly by studying formal theories and interesting works, but also through exercises such as writing compilers and learning new programming languages and ideas. It frustrates me to no end when somebody can get by, not on the merit or accuracy of their ideas, but rather due to politics or the networking effect of a bunch of fakers who can't recognize the difference.
But I am also very cognizant of what I do not know or understand. This, I think, is as important as knowing a lot. Well, at least in a person interested in contributing at least one idea of importance to a field. Interestingly, Simon makes this same point in his “how to write a good research paper” talk referenced here. He advises that the process of understanding what you do not know helps you to better focus your research activities to fill in the gaps. I couldn't agree more. (In my limited experience.) The sooner you identify these areas, the sooner you can make progress.
I also find it interesting that, through the act of filling knowledge gaps, the foundational knowledge built up in your head seems to shift, sometimes causing other areas to become obsolete. This tends to create new gaps which need to be re-filled in the context of the new shifted foundation, but also creates new and interesting connections and clarifications on previously faulty mental models (which you might have thought were accurate, and which still likely aren't). But it's a beautiful, iterative process. No room for faking. And it certainly never ends.
 Wednesday, March 23, 2005
This can probably fall into the “paranoid programmer” category... where this time the paranoia is not about async exceptions, but rather about pulling in the JIT unnecessarily.
Back in September, we did a fair amount of work documenting and getting FxCop rules in place to check when the use of generics can cause an NGen'd assembly to JIT. This was mostly in response to our general no-JIT plan that most managed code we ship is on. In particular, Avalon drove us hard to come up with this.
Joel has a great entry about code sharing vis-a-vis generics over on his blog. I'd read that alongside this.
(BTW, in re-reading my DG update entry before posting, I think there's some significant clarification and re-work that we could/should do. Add it to the growing stack of things to do! :))
Generics and Performance
- Do consider the performance ramifications of generics. Specific recommendations arise from these considerations and are described in guidelines that follow.
Execution Time Considerations
- Generic collections over value types (e.g. List<int>) tend to be faster than equivalent collections of Object (e.g. ArrayList) because they avoid boxing items.
- Generic collections over all types also tend to be faster because they do not incur a checked cast to obtain items from the collection.
- The static fields of a generic type are replicated, unshared, for each constructed type. The class constructor of a generic type is called for each constructed type. For example,
public class Counted<T>
{
    public static int count;
    public T t;

    static Counted()
    {
        count = 0;
    }

    // Public so that callers can actually construct instances.
    public Counted(T t)
    {
        this.t = t;
        ++count;
    }
}
Each constructed type Counted<int>, Counted<string>, etc. has its own copy of the static field, and the static class constructor is called once for each constructed type. These static member costs can quietly add up. Also, accesses to static fields of generic types may be slower than accesses to static fields of ordinary types.
- Generic methods, being generic, do not enjoy certain JIT compiler optimizations, but this is of little concern for all but the most performance critical code. For example, the optimization that a cast from a derived type to a base type need not be checked is not applied when one of the types is a generic parameter type.
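The point about static fields being replicated per constructed type is easy to see in a small sketch. The Tally<T> and TallyDemo names below are made up for this illustration and are not part of the guideline text.

```csharp
using System;

// Hypothetical type: each constructed type (Tally<int>, Tally<string>, ...)
// gets its own independent copy of the static Count field.
public class Tally<T>
{
    public static int Count;
    public readonly T Value;

    public Tally(T value)
    {
        Value = value;
        ++Count;
    }
}

public static class TallyDemo
{
    // Creates two Tally<int> instances and one Tally<string> instance,
    // then reports the two separate counters.
    public static int[] Run()
    {
        new Tally<int>(1);
        new Tally<int>(2);
        new Tally<string>("a");
        return new[] { Tally<int>.Count, Tally<string>.Count };
    }
}
```

On a first call, Run should report 2 for Tally<int> and 1 for Tally<string>, since the two constructed types do not share the counter.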
Code Size Considerations
- The CLR shares IL, metadata, and some JIT’d/NGEN’d native code across types/methods constructed from generic types/methods. Thus the space cost of each constructed type is modest, less than that of an empty conventional non-generic type. But see also ‘current limitations’ below.
- When a generic type references other generic types, then each of its constructed types constructs its transitively referenced generic types. For example, List<T> references IEnumerable<T>, so use of List<string> also incurs the modest cost of constructing type IEnumerable<string>.
Current Code Sharing Limitations
In the current CLR implementation, native code method sharing for disparate generic type combinations occurs only for types constructed over reference type parameters (e.g., List<object>, List<string>, List<MyReferenceType>). Each type constructed over value type parameters (e.g., List<int>, List<MyStruct>, List<MyEnum>) will incur a separate copy of the native code for the methods in those constructed types. For comparison purposes, this is similar to the runtime cost of creating your own strongly typed collection class.
A consequence of this is that using a generic type defined in mscorlib in combination with value type parameters also from mscorlib could cause an NGen image to invoke the JIT during execution, resulting in a negative effect on performance. This is limited to mscorlib because it is the only assembly always loaded domain-neutral, and for a variety of reasons there are limitations on code sharing when working with domain-neutral assemblies. (Note: domain-neutrality is a load time decision, so it is possible that this would affect other assemblies, too.) For generic types that take a single type parameter, for example, it is relatively straightforward to determine whether this will affect your scenario: When instantiating G<VT>, where generic type G and value type parameter VT are both defined in mscorlib, you will JIT unless G<VT> is found in the following list:
ArraySegment<Byte>
Nullable<Boolean>
Nullable<Byte>
Nullable<Char>
Nullable<DateTime>
Nullable<Decimal>
Nullable<Double>
Nullable<Guid>
Nullable<Int16>
Nullable<Int32>
Nullable<Int64>
Nullable<Single>
Nullable<TimeSpan>
List<Boolean>
List<Byte>
List<DateTime>
List<Decimal>
List<Double>
List<Guid>
List<Int16>
List<Int32>
List<Int64>
List<SByte>
List<Single>
List<TimeSpan>
List<UInt16>
List<UInt32>
List<UInt64>
This is so because we have added some code to bake the data structures for these generic instantiations into mscorlib. Because it affects the working set of mscorlib, we couldn’t do it for every possible combination. For generic type instantiations that take multiple type parameters, the rules are more complex: Roughly, when instantiating G<T1…Tn>, where generic type G and each T in T1…Tn are defined in mscorlib, at least one of which is a value type, you will JIT unless G<T1…Tn> is found in the following list (note: substitute Object for any reference type parameter):
Dictionary<Char, Object>
Dictionary<Int16, IntPtr>
Dictionary<Int32, Byte>
Dictionary<Int32, Int32>
Dictionary<Int32, Object>
Dictionary<IntPtr, Int16>
Dictionary<Object, Char>
Dictionary<Object, Guid>
Dictionary<Object, Int32>
KeyValuePair<Char, UInt16>
KeyValuePair<UInt16, Double>
Note that JITting will occur neither when using a custom generic type with mscorlib type arguments (e.g. MyType<int>) nor when using an mscorlib type with your own type arguments (e.g. List<MyStruct>).
What is described above is actually the worst case scenario. There are some subtleties that could result in a more relaxed application of these rules. Unless the coverage above has caused you to worry whether you might be affected, it’s probably safe to skip this section. A more comprehensive explanation of these subtle variables follows:
Annotation (RicoM):
Let A be an assembly, G a generic type on n parameters, T1…Tn
A generic type G<T1,…,Tn> shares code with type H<S1,…,Sn> if and only if
- G = H and,
- for all i in [1..n] either
- Ti = Si, or,
- Ti shares code with Si, or,
- both Ti and Si are reference types.
------
Assume that A has been NGen'd, then:
If G is defined in A
- A may use G<T1…Tn> with no restrictions on T1…Tn, no JITting is required,
- A will include G<Object,…,Object> even if it is not otherwise mentioned
- The above two uses are present in A for other assemblies to use
(Remaining cases G not defined in A)
If all of T1…Tn are defined in A
- A may use G<T1…Tn> with no restrictions on T1…Tn, no JITting is required
- Any such G<T1…Tn> will be present in the NGen’d image of A for other assemblies to use
(Remaining case at least one of T1..Tn not defined in A)
If A depends on assembly B and B has a type that shares code with G<T1…Tn>
- A may use G<T1…Tn> as found in B, no JITting is required
- A will not contain code for G<T1…Tn>
(Remaining case, no match possible, this is the fallback position)
A copy of G<T1…Tn> will be emitted into the NGen'd code for A, this code is available for other assemblies to use (subject to these same rules)
If A is loaded domain-specific and G, T1..Tn are all loaded from domain-neutral assemblies then
- The CLR will be unable to use the copy in A and will JIT a new one
Otherwise
- The copy of the code in A is used and there is no JITting
These rules apply transitively to all generic instantiations encountered when JITting the code for both the non-generic classes (which may contain generic members), and generic classes (which may have generic members or parameters themselves) and the “seed” <Object,…,Object> instantiations in the assembly.
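The "shares code with" relation at the top of the annotation above reads naturally as a recursive predicate. The following is only a sketch of that stated rule using reflection. It does not query what the runtime actually does, and the CodeSharing class and SharesCode method are hypothetical names.

```csharp
using System;

// Hypothetical helper modelling the stated rule: G<T1..Tn> shares code with
// H<S1..Sn> iff G = H and, for every i, Ti = Si, or Ti shares code with Si,
// or both Ti and Si are reference types.
public static class CodeSharing
{
    public static bool SharesCode(Type g, Type h)
    {
        // The rule is only defined over constructed generic types.
        if (!g.IsConstructedGenericType || !h.IsConstructedGenericType)
            return false;

        // G = H: both must come from the same generic type definition.
        if (g.GetGenericTypeDefinition() != h.GetGenericTypeDefinition())
            return false;

        Type[] ts = g.GetGenericArguments();
        Type[] ss = h.GetGenericArguments();
        for (int i = 0; i < ts.Length; i++)
        {
            bool compatible =
                ts[i] == ss[i]                                  // Ti = Si
                || (!ts[i].IsValueType && !ss[i].IsValueType)   // both reference types
                || SharesCode(ts[i], ss[i]);                    // Ti shares code with Si
            if (!compatible)
                return false;
        }
        return true;
    }
}
```

Under this model, List<string> and List<object> share code while List<int> and List<long> do not, which matches the reference type versus value type distinction described under the current limitations.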
In the future, more method sharing may be possible, but a high degree of code sharing over different struct type parameters is unlikely or impossible.
Summary
In summary, from the performance perspective, generics are a sometimes-efficient facility that should be applied with great care and in moderation. When you employ a new constructed type formed from a pre-existing generic type, with a reference type parameter, the performance costs are modest but not zero. The stakes are much higher when you introduce new generic types and methods for use internal or external to your assembly. If used with value type parameters, each method you define can be quietly replicated dozens of times (when used across dozens of constructed types).
- Do use the pre-defined System.Collections.Generic types over reference types. Since the cost of each constructed type AGenericCollection<ReferenceType> is modest, it is appropriate to employ such types in preference to defining new strongly-typed-wrapper collection subtypes.
- Do use the pre-defined System.Collections.Generic types over value types in preference to defining a new custom collection type. As was the case with C++ templates, there is currently no sharing of the compiled methods of such constructed types – e.g. no sharing of the native code transitively compiled for the methods of List<MyStruct> and List<YourStruct>. Only construct these types when you are certain the savings in dynamic heap allocations of avoiding boxing will pay for the replicated code space costs.
- Do use Nullable<T> and EventHandler<T> even over value types. We will work to make these two important generic types as efficient as possible.
- Do not introduce new generic types and methods without fully understanding, measuring, and documenting the performance ramifications of their expected use.
 Tuesday, March 22, 2005
I guess maybe I'm a moron, but I just don't get posts like these (here and here).
Targeting the CLR is just like targeting any other machine, such as x86 for example. It just happens to be a little easier. Fortunately, we provide a big bag of pre-built abstractions. If you want to party with a category of other languages that already target the platform, you can choose to take advantage of these abstractions and be labeled CLS compliant. I'll be the first to admit that this entails some sacrifice, and a bit of kowtow to the C# gods. But it's a freaking choice. And not doing it leaves you no worse off than if you chose to target a different platform. Further, you can opt to produce verifiable code and take advantage of the runtime's security sandbox. This too requires little sacrifice, although your solutions might not seem as clever. Or not.
You're free to build up your own abstractions. You still get a bunch for free, like a kickass GC, a crapload of libraries (my Scheme runtime lib mostly consisted of one or two calls into Framework code), tools for debugging/profiling, and so on. Write 'em from scratch if you wish, it's all good. I know a lot of people enjoy writing GC's in their spare time (I'm not trying to be funny... it seems like a terribly complex and challenging exercise).
But, if somebody claims they can't write a dynamic language compiler on the CLR, well, they're certainly incapable of producing a dynamic language compiler that targets x86 or any other machine/VM. Or they're lying. So it hardly matters. It doesn't require too much imagination.
If somebody claims they can't write a dynamic language compiler on the CLR that interoperates well with other CLR languages, we're working on making this better. But it seems that this is an even bigger challenge if you intend to interoperate at a layer above the platform altogether. So again, things certainly aren't worse on the CLR.
But hey... I'm now just another Redmond-ite who has had a little too much kool-aid for dinner. Oh! I'm late for my evening brainwashing session. BillG is good. BillG is great. ;)
 Monday, March 21, 2005
Just a quick reminder, as Brad and Kit mentioned, a bunch of the CLR and BCL team will be at the Chili's in the Bellevue Crossroads mall tomorrow (Tue, 3/22). Starts at roughly 5:30pm... I'm hoping there will be more customers and partners than Microsoft employees!
 Sunday, March 20, 2005
I was just working on some early work on the STG->IL compiler I mentioned last week. Perhaps my brain is just shifting a little too much into functional mode, but I designed my entire AST to avoid any fields for almost all nodes. Things that would ordinarily call for fields are just pairs (or higher order combinations of pairs, such as triples and lists). I don't know if this is good, bad, or whatever, but it's certainly interesting.
For example, given the nonterminals:
prog  : binds
binds : var1 = lf1; ...; varn = lfn   (where n >= 1)
lf    : varsa \u varsb -> expr
Well, prog just derives from binds. Binds is simply a list of pairs of vars and lfs (lambda forms). Vars are just symbols (which themselves are just strings, but alas I can't derive from string in the Framework since it's sealed, so here I do need a field unless I choose to just use List<char>), and lambda forms are just quintuples of other stuff. All of these use a marker interface IAstNode to indicate that they're an AST node rather than cluttering up their inheritance hierarchy.
For example:
interface IAstNode {}
class Program : BindingList, IAstNode {}
class BindingList : List<Binding>, IAstNode {}
class Binding : Pair<Variable, LambdaForm> {}
class Quintuple<A,B,C,D> : Pair<A,Pair<B,Pair<C,D>>> {}
class LambdaForm : Quintuple<VariableList, UpdateFlag, VariableList, Expression> {}
...and so on...
Why do I find this so much more beautiful than the dirty, messy, more verbose field-based approach? I mean, everybody always talks about the is-a thing when preaching OOP... But if a program is-a list of bindings, and a binding is-a blah why not reflect that in the type relationships?
Anyhow, this isn't thought through too much. But it certainly seems clearer. For example, we do a lot of work in the framework to make strings seem like lists of characters... Why fake it? E.g. class String : List<char> { /*string-specific functions*/ }. Surely string is conceptually closer to having a list of characters being its direct supertype than plain old vanilla object. (Although on second thought the realities of performance and such probably limit the extent to which we could do this.) I guess it's the old aggregation versus inheritance argument. Blegh.
But it's really nice that I can program like this if I wish. Being able to compose powerful abstractions through nothing but inheritance and polymorphism is a great feature of and a testament to the CTS.
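For what it's worth, a cut-down version of this is-a modelling compiles and can be walked like any other list. The sketch below is only that. Pair<A, B> is a hypothetical type the post assumes but never shows, LambdaForm is reduced to an empty placeholder, and the StgAst namespace and AstDemo helper are names invented for the illustration.

```csharp
using System.Collections.Generic;

namespace StgAst
{
    public interface IAstNode { }

    // Hypothetical Pair type; something has to hold the two components.
    public class Pair<A, B>
    {
        public readonly A First;
        public readonly B Second;
        public Pair(A first, B second) { First = first; Second = second; }
    }

    // string is sealed, so a variable needs a field for its name.
    public class Variable : IAstNode
    {
        public readonly string Name;
        public Variable(string name) { Name = name; }
    }

    // Placeholder for the real lambda form (free vars, update flag, args, body).
    public class LambdaForm : IAstNode { }

    // A binding is-a pair; in C# the base class precedes any interfaces.
    public class Binding : Pair<Variable, LambdaForm>, IAstNode
    {
        public Binding(Variable v, LambdaForm lf) : base(v, lf) { }
    }

    // A binding list is-a List<Binding>, and a program is-a binding list.
    public class BindingList : List<Binding>, IAstNode { }
    public class Program : BindingList { }

    public static class AstDemo
    {
        // Because Program is-a List<Binding>, it can be enumerated directly.
        public static List<string> BoundNames(Program prog)
        {
            var names = new List<string>();
            foreach (Binding b in prog)
                names.Add(b.First.Name);
            return names;
        }
    }
}
```

A consumer never touches a "Bindings" field on the program. It just iterates the program itself, which is the appeal of the approach.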
BTW: How did we all survive w/out generics pre-2.0?
 Friday, March 18, 2005
I recently sent this out to an internal audience. I saw no reason not to share it with external folks, too... Although most of it probably won't be of interest, maybe somebody out there will get something useful from it.
Atomicity & Asynchronous Exception Failures
We often get asked how Framework developers should write atomic paired operations that are reliable in the face of asynchronous exceptions, the canonical example being thread aborts. These operations might be the acquisition and release of a lock or the allocation and deallocation of some unmanaged resource, for example. Not dealing with the risk that an exception might interrupt these (hence breaking atomicity) might cause undesirable reliability, or possibly even security, problems for you. A lot of people wonder how paranoid they should be when coding defensively against these scenarios.
This email is intended to clarify today’s state of affairs (as of Whidbey Beta2), provide guidance to those writing reusable managed code, and to shed a little insight into our thinking around where we might go in the future. To be entirely transparent, the top priority is first to convince you that, in almost all cases, you don’t need to and shouldn’t be writing the paranoid code necessary to deal with such problems. Only after that’s out of the way will we discuss how to do what you originally might have thought you wanted to do (although, hopefully by then you’ll have been convinced otherwise).
This is meant to complement Chris Brumme’s recent email “Exceptions, Security & Frameworks,” available here. In particular, the thread abort section #9. A lot of this will make its way into the Design Guidelines document in a more easily consumable form over the coming weeks.
A quick overview
We consider it the responsibility of Framework code to guarantee cleanup of state which spans AppDomains. It is also the responsibility of Framework code to block threads in a manner in which the CLR can take control of them. (For instance, use WaitHandle.WaitOne rather than a P/Invoke to WaitForSingleObject.) And it is also the responsibility of Framework code to tolerate AppDomain unloads. This includes tolerating a certain level of inconsistency, as seen by Finalize methods and AppDomain.DomainUnload events during an unload. There are some things Framework code isn’t responsible for, though. It is entirely up to the host to deal with and contain possible inconsistent or corrupt state inside an AD that occurs as a result of raising asynchronous exceptions on some opaque executing thread. Further, it is the responsibility of code which consumes an API to recover deterministically as needed from operations that might fail, not that of the API itself. This last paragraph is dense and perhaps the core message of this email, so you might want to go back and reread it.
Let’s examine some implications of this, using thread aborts as a running example. Asynchronous thread aborts that are part of an AD unload are acceptable, and Framework code needs to tolerate them. Random aborts that have nothing to do with an unload are not fine, and the Framework generally need not remain consistent when they occur. The host is making a decision that the risk of inconsistency is less than the value of injecting these kinds of aborts and then continuing to run. SQL Server can make this decision because it monitors whether the threads it aborts are modifying shared state at the same time.
Asynchronous exceptions are tricky since they manifest as exceptions originating at seemingly any instruction in code running on a thread, essentially introducing nondeterministic races all over the place. By asynchronous, this just means that the target of a thread abort is different than the thread actually asking for the abort, either through the Thread.Abort() API or as part of the AppDomain unload process. Although saying “any instruction” is a bit of an overstatement. We don’t process thread aborts if you’re executing inside a Constrained Execution Region (CER), finally block, catch block, .cctor, or while running inside unmanaged code. If an abort is requested while inside one of these regions, we process it as soon as you exit (assuming you’re not nested inside another).
Synchronous thread aborts (e.g. Thread.CurrentThread.Abort()) aren’t a concern at all since the result is analogous to manually throwing a new ThreadAbortException exception (with the exception that they get re-raised after catch blocks). This is completely deterministic and happens at well defined points in your code; thus, it doesn’t carry the same risk of corruption as asynchronous aborts and won’t be discussed further.
Atomic pairs
If you have a piece of code that introduces a state change, makes some side effect, or acquires or allocates a resource, for example, there will usually be a paired operation intended to roll back the operation. One normally has a desire to make these two things atomic. That is, if one occurs, the other is also guaranteed to occur. If you open a file, you normally need to make sure you close it. After you acquire a lock on an object, you probably want to release it when you’re done so that you avoid deadlocks. And further, the sooner you can do this, the better. Deterministic (or eager) action is usually preferable to nondeterministic (or late). While these things are reasonably simple to achieve about 99.x% of the time, achieving that extra 0.(1-0.x)% is extraordinarily difficult, and indeed seldom justifies the complexity and difficulty of trying to become resilient.
As a simple example, all of this means that given trivial code like this:
using (Foo x = new Foo())
{
    // …
}
It can fail in nontrivial ways. For example, if a ThreadAbortException were raised sometime between the invocation of Foo’s constructor and the assignment to x, then Dispose() will never get called on the instance that got created. This is because at the end of the scope, x is still null (since it was never assigned a value). Assuming some resources got allocated in Foo’s constructor, it’s the responsibility of Foo’s finalizer to clean up after itself at this point. It should also be noted this problem can also occur if Foo’s constructor throws an exception, which is a particularly good reason to avoid throwing exceptions from constructors and to instead prefer acquisition of resources inside discrete methods.
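To see where that window sits, it helps to write out roughly what the compiler produces for a using statement. This is a sketch only. Foo here is a stand-in with a Disposed flag, and UsingExpansion.Run is a hypothetical helper that takes the body of the using block as a delegate.

```csharp
using System;

// Stand-in for the Foo in the example; it just records whether Dispose ran.
public sealed class Foo : IDisposable
{
    public bool Disposed { get; private set; }
    public void Dispose() { Disposed = true; }
}

public static class UsingExpansion
{
    // Roughly equivalent to: using (Foo x = new Foo()) { body(x); }
    public static Foo Run(Action<Foo> body)
    {
        // The acquisition happens before the try block is entered. An
        // asynchronous exception raised after the constructor has run but
        // before the assignment completes never reaches the finally below,
        // so Dispose is not called for that instance.
        Foo x = new Foo();
        try
        {
            body(x);
        }
        finally
        {
            if (x != null)
                x.Dispose();
        }
        return x;
    }
}
```

Ordinary exceptions thrown from the body are still covered by the finally block. It is only the gap between construction and assignment that is left exposed.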
Now, hopefully Foo was written to have a finalizer that will eventually clean up such resources. Indeed, if Foo has a public constructor, then the author of Foo can have no guarantees that Foo is only used in the context of a ‘using’ statement. In other words, Foo would have no guarantees that its IDisposable interface is ever called on any of its instances. If we’re talking about publicly constructable objects, the only guarantees worth considering are the guarantees made to the client code.
While this does mean that whatever Foo introduced might take a little while to roll back, since we’re processing a thread abort it’s safe for Framework code to assume that we are unloading an AppDomain and therefore will be executing finalizers shortly anyhow, which means this is perfectly acceptable most of the time. (We talk about why this is so further below.) Here’s where the paranoia starts. What if surrounding code assumes that Foo will always have been disposed of once control escapes the using statement? In this case, this assumption doesn’t hold. If the surrounding code is privy to the internal details of what state Foo changes and how (such as knowing it creates a particular file on disk and cleans it up during Dispose, for example), it might be written against a false set of invariants. This might cause failures to ripple up the call stack.
Thankfully, as is the case with infrastructure exceptions like thread abort, the exception will be forced to re-raise at the tail end of any catch blocks. So long as catch blocks are written without such assumptions, at least non-catch code won’t execute and see surprising state. And so long as any state which spans a single AppDomain is cleaned up in finalizers, such failures don’t seem quite so bad.
Some simple dos and don’ts
There are some simple things you can do to make your code more robust without stepping over the paranoia threshold. As I stated above, most people writing Framework code shouldn’t even care about most of what appears after this section.
1. We don’t expect most of the Framework to be willing or able to recover fully from asynchronous exceptions. In fact, the sole responsibility of code executing during a thread abort or an AD unload is to fix up corruption to state that spans AppDomains. By span, this just means that the management and lifetime of that state is orthogonal to that of the AppDomain executing code which is manipulating it. It’s safe to assume that, if code is subject to a thread abort, it will be shortly followed by an AD unload. Your goal should be to make “shortly” as short as possible, namely by reducing the amount of work you do as a result of these events. This includes finally blocks, finalization, and AD unload event handlers.
This guidance immediately rules out having to worry about protecting state which is entirely isolated inside an AD, such as managed object monitors, for example. This is true, of course, unless doing so would subject your code to possible security holes, in which case you might have to worry at least a little about this. A malicious person who knows you end up violating a bunch of invariants because you didn’t write code to deal with a thread abort might use this knowledge to find new and interesting exploits. If you’re doing thread impersonation or some other scary security elevation, you likely want to guarantee (via ExecutionContext.Run) that it gets reverted before passing control back up the stack, even in the face of thread aborts. Fortunately, aborting a thread does demand privileges not granted to most partial trust callers, but this does not rule out some bug exposing a reproducible way to provoke aborts. Not likely, but not impossible either.
2. Most of the time, you can (and should) rely on lazy cleanup to prevent leaks. Eager cleanup is useful, but you should always use a finalizer to guarantee that important state gets rolled back and that resources get reclaimed. Better yet, use SafeHandle (http://blogs.msdn.com/bclteam/archive/2005/03/16/396900.aspx), especially for cross-AD state. SafeHandle uses critical finalization to ensure execution even in the face of rude thread aborts and unloads. Regular finalizers won’t get a chance to run in such situations. True, lazy cleanup can lead to undesirable intermediary situations, namely while the abort is getting propagated and other cleanup code executed, but it’s certainly better than letting a leak past the AD unload entirely. The benefit of forcing eager cleanup in the face of the rare occurrences described in this paper is not worth the significant cost you would have to incur. Try not to write code that depends too intimately on eager cleanup having taken place, especially inside other cleanup code.
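To make the SafeHandle suggestion concrete, here is a minimal sketch. The type name SafeHGlobalBuffer is hypothetical (it is not a Framework type), and wrapping Marshal.AllocHGlobal is just a convenient stand-in for whatever unmanaged resource you actually own. Note that this sketch does not itself close the window between the allocation and SetHandle; for handles returned from P/Invoke calls, the runtime's marshaling layer is what makes that step reliable.

```csharp
using System;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

// Hypothetical example type: owns an unmanaged buffer from Marshal.AllocHGlobal.
public sealed class SafeHGlobalBuffer : SafeHandleZeroOrMinusOneIsInvalid
{
    // Passing true to the base constructor means this instance owns the
    // handle and is responsible for releasing it.
    public SafeHGlobalBuffer(int byteCount) : base(true)
    {
        SetHandle(Marshal.AllocHGlobal(byteCount));
    }

    // Called at most once, either from Dispose() (eager cleanup) or from
    // the critical finalizer (lazy cleanup), and only for a valid handle.
    protected override bool ReleaseHandle()
    {
        Marshal.FreeHGlobal(handle);
        return true;
    }
}
```

Client code would normally wrap an instance in a ‘using’ statement for eager cleanup; if Dispose never runs, the critical finalizer inherited from SafeHandle is the lazy backstop.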
3. Don’t use finally blocks to intentionally delay asynchronous exceptions. While a clever observation, writing your code entirely inside finally blocks to prevent an asynchronous exception from being injected between a pair of operations is a horribly bad practice. This is especially true of long running sections of code, particularly those which have a blocking operation thrown into the mix. Doing this holds up processing of thread aborts and can hang AD unloads. This will result in a poor application experience for those who rely on your code.
For example, consider this snippet:
    ReaderWriterLock rwl = /*…*/;
    bool taken = false;
    try {
        try {} finally {
            rwl.AcquireWriterLock(-1);
            taken = true;
        }
        // do some work
    } finally {
        if (taken) rwl.ReleaseWriterLock();
    }
Yes, this prevents a normal asynchronous thread abort from occurring between the lock acquisition and entrance into the try block, but it also introduces the significant unintended consequences cited above. Moreover, if your code blocks upon trying to acquire a resource, we wouldn’t be able to abort it until it unblocks itself. If it’s in a deadlock or long-running acquisition, this would be pretty horrible.
4. Start paired operations inside a try block when possible. This might deviate from guidance given in the past, but nonetheless helps to eliminate the window between an acquisition and entrance into the try block. While this window can seem tiny (e.g. acquiring a lock right before entering a try), the JIT is free to inject as much code inside of it as it sees fit. This will undoubtedly be a small quantity, but nonetheless increases the window’s duration and hence likelihood of being interrupted. With the window eliminated entirely, however, it means that you can be assured your finally block will run during an abort. It does complicate matters slightly, as now you can’t be certain whether the operation succeeded or not, and therefore whether you have any state rollback to perform in your finally.
There are a few ways to deal with this, but in general your finally block should be resilient to failure. This might mean trying an operation and swallowing a resulting exception (like releasing a lock that never got successfully acquired), or using an out parameter in the acquisition method. Unless you use a CER or finally block to do the acquisition, you can’t be assured that you ever completed an assignment or operation fully. But in (hopefully) most cases, you won’t need to know. And this is usually more attractive than worrying whether your finally block will get a chance to run.
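As a rough illustration of #4, the sketch below starts the acquisition inside the try block and makes the finally block resilient by asking the lock itself whether it is held. PairedOperationSketch and WithWriterLock are hypothetical names used only for this example, and the sketch assumes the calling thread does not already hold the writer lock on entry (otherwise the check in the finally could release an outer acquisition).

```csharp
using System;
using System.Threading;

public static class PairedOperationSketch
{
    // Hypothetical helper: runs 'work' while holding the writer lock.
    public static void WithWriterLock(ReaderWriterLock rwl, Action work)
    {
        try
        {
            // The acquisition happens inside the try, so there is no window
            // between taking the lock and entering the protected region.
            rwl.AcquireWriterLock(Timeout.Infinite);
            work();
        }
        finally
        {
            // We can't know whether the acquisition completed, so the
            // finally asks the lock rather than trusting a local flag.
            if (rwl.IsWriterLockHeld)
                rwl.ReleaseWriterLock();
        }
    }
}
```

Unlike the snippet in #3, nothing here runs the blocking acquisition inside a finally block, so a thread abort can still interrupt a long wait for the lock.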
5. Always use the ‘using’ and ‘lock’ statements in C# as appropriate. If you’re familiar with the code that gets generated for these, you might notice a discrepancy between #4 and #5. This is true. Both of these constructs expand to code which does the acquisition right before entering the try block, which as we established in the opening section, could lead to missed deterministic cleanup. But as we’ve also already established, most of this shouldn’t be a concern to you by now. In fact, the ‘lock’ statement has a hack to ensure that, at least in the absence of a rude abort or rude AD unload, the unlock will always occur. (Unfortunately, because it’s a JIT hack, it’s not guaranteed to happen across all platforms.) By using these constructs, we can optimize your code for free in the future if and when we reassess how better to make the code they emit (or how the JIT responds to such code, as in the case of ‘lock’) more reliable in the face of asynchronous failures.
6. Never initiate asynchronous thread aborts in the Framework. This is perhaps too strongly worded, but most code should never try to abort an opaque thread. Only sophisticated hosts who are prepared to deal with the ensuing corruption and inconsistent state inside an AD should ever attempt to do this, and even then I emphasize the word “attempt.” If you’re doing this, know that we are recommending Framework code to be written with the assumption that an abort will be followed shortly by an unload. If you don’t abide by this policy, code probably won’t react as you might have hoped. There’s an MDA in Whidbey that catches this. We expect people writing applications to make this mistake, but please, please, please just don’t do it from Framework code.
7. Use CERs sparingly. CERs can be used to patch up known problems and to protect state that spans ADs, but unless you know for sure that one of these applies to you, don’t even consider using one. You need to execute under particularly bad circumstances, and truthfully doing it right is rocket science. Low level infrastructure code will use CERs more and more in the future, but for most of the code that builds on top of this infrastructure, CERs aren’t necessary.
Stress failures, complexities
The picture of the world depicted above is a bit naive. For example, it suggests that orphaning an object lock is acceptable during an AD unload. While in theory this is a fine thing to live with, there’s a set of code which will run during an unload that might make assumptions about what invariants have or have not been kept in place. This is one of the reasons why it’s so important to reduce the complexity and quantity of code that runs in such situations, and in particular to reduce as many inter-dependencies as possible. And if an asynchronous exception was raised and the AD isn’t actually being unloaded, then certainly a whole host of things are sure to be wrong within the AD’s boundaries afterwards. Most of these problems become magnified when examined under high stress loads.
Orphaned monitors are an example of a perfectly isolated resource that can still cause problems. If an AD is getting unloaded and a) this involves multiple threads and b) their cleanup paths attempt any lock acquisition whatsoever, there is a real risk of deadlock due to a lock that never got released. But again, this is by and large an application problem, not one for the Framework to be concerned about. Most of the Framework is written not to be thread-safe... at least not to access precise shared locks across thread boundaries anyhow. The criticality of a deadlock, however, is precisely why we added special code to recognize the ‘lock’ pattern—this guarantees lock release in the face of asynchronous exceptions (although not during rude aborts and rude unloads).
It’s likely that inside hosts which aggressively perturb code with thread aborts, such as ASP.NET for example, innocent Framework code might fail in new and interesting ways. We’re not suggesting that we shouldn’t evolve to deal with these situations as they arise, simply that the cases are so sporadic and difficult to predict that developers shouldn’t proactively seek to fix problems that might not exist. Aggressive stress testing should uncover most of these problems.
Future direction
Of course, this section is highly speculative. The topics raised in this paper are something that we (the CLR team) intend to look very closely at in the Orcas timeframe. It’s likely that we’ll dream up some great new solutions. It’s also likely that any such solutions will require a combination of runtime, language, and Framework support to get right.
For example, we’ve begun to adopt the pattern of providing out parameters to confirm acquisition of a resource, where the acquisition occurs inside a protected region. This is mostly to avoid the difficulties explained in #3 and #4 above. The Monitor.ReliableEnter method does just this (although it is only internally available, so you’re out of luck if you’re not shipping inside mscorlib). It executes inside a CER and sets a bit to indicate that you’ve successfully taken the lock. This alleviates concern about #4 above causing problems. One could imagine the C# ‘lock’ keyword emitting code like this in the future:
    bool took = false;
    try {
        Monitor.ReliableEnter(foo, out took);
        // code inside synchronized block
    } finally {
        if (took) Monitor.Exit(foo);
    }
So long as you abide by rule #5 above, you will get the benefits of whatever innovation we do for free, and this doesn’t rely on any JIT hackery. We will certainly look for more places to introduce such a pattern—and even make it public, too.
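Since Monitor.ReliableEnter is internal, the pattern above can't be compiled outside mscorlib. The sketch below is an assumption-laden stand-in: it substitutes the public Monitor.Enter(object, ref bool) overload that shipped later, in .NET Framework 4, which takes a ref rather than an out parameter. The Counter class is hypothetical and exists only to give the pattern something to protect.

```csharp
using System;
using System.Threading;

// Hypothetical example type used only to demonstrate the pattern.
public sealed class Counter
{
    private readonly object sync = new object();
    private int value;

    public int Increment()
    {
        bool took = false;
        try
        {
            // Stand-in for Monitor.ReliableEnter: 'took' is set to true
            // only if the lock was actually acquired.
            Monitor.Enter(sync, ref took);
            return ++value;
        }
        finally
        {
            // Release only what we know we acquired.
            if (took) Monitor.Exit(sync);
        }
    }
}
```

The acquisition sits inside the try, as #4 recommends, and the flag tells the finally block whether there is anything to roll back.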
Dispose is unfortunately more difficult to solve. The goal here would be to first make sure the construction of an object and assignment to a local variable is atomic, and second to ensure there’s no window between the assignment and entrance into the try block. Without running constructors inside a protected region, however, there’s no obvious great solution. We certainly wouldn’t advise anybody to execute constructors inside a finally block, for example. But at the same time, we know we need to figure this one out. Doing resource allocations inside a protected factory method is one option, for instance, but only one that makes sense in a situation where you’d feel comfortable holding up an AD unload in order to protect state.
This guidance was created mostly in response to constant feedback and questions we receive on the topic. A lot of this will make its way into the Design Guidelines in the form of more prescriptive guidance. This paper was largely an exploratory exercise resulting from conversations and meetings on the topic. In particular, I’d like to thank Chris Brumme, WeiWen Liu, Brian Grunkemeyer, Anthony Moore, and Dave Fetterman for their helpful feedback and suggestions, and indeed for pushing the direction of the core message and points in the above text.
 Saturday, March 12, 2005
I’m in the process of writing up several articles/whitepapers that I intend to start publishing over the coming weeks.
- Dispose, Finalization & Resource Management Design Guidelines: I’ve been iteratively working on these updates over the past couple months and just submitted the whole thing to our core review group to solicit feedback, comments, and the like. It covers the recent unification work we did around the Dispose pattern in the Framework, along with general guidance on writing correct resource management code. For a bit of history around this, check out here and here. The end result turned out to be a bit longer and larger than I had originally intended, but I think I was able to lay it out in an intuitive and easily consumable way.
- Writing Atomic Abort-Safe Code: A lot of people writing code for the Framework lose a bunch of sleep over writing the most reliable code possible. However, our guidance on how to do this in the face of asynchronous thread aborts hasn’t been clear in the past, especially when writing paired operations (e.g. acquire/release semantics). So a bunch of us got together, discussed it, and decided we need to come up with some clear guidelines. I’m writing up a brief article and following it up with a DG update and some FxCop rule proposals. Priority #1: convince most people that they needn’t worry about these things. Only then will the painful details of abort semantics, AD unloads, CERs, critical finalizers, finallys, and the like be presented.
- Threading Security Best Practices Design Guidelines: During the recent Whidbey security push, we produced a lot of great content around how to write secure multi-threaded code. Unfortunately, for obvious reasons much of this must remain inside MS. But some of the more general guidance is being incorporated into the DG. This includes, for example, avoiding publicly-accessible locks, never accepting ReaderWriterLock LockCookies from untrusted sources, and avoiding message pumping inside a synchronized region. Hopefully this will help users of the Framework to write more robust and secure code, too.
- Concurrency & Parallelism: These topics are more personal research interests of mine, but nonetheless somewhat related to my job as the threading PM. I walk through a number of aspects of concurrency, parallelism (yes there’s a difference), and the nature of each. Specifically with regards to parallelism, I discuss the intrinsic algorithmic properties which are conducive to parallel execution, and some theoretical math which demonstrates that, with the right multi-x (where x is core|proc) and task management architectures, the future looks promising. Check out this butterfly sorter. I recently had an interesting conversation with its author, Satnam Singh, and we seem to agree on many things. This particular example was designed and implemented in specialized hardware, but there’s no reason why we can’t write such things generically in software to take advantage of the underlying hardware support. I present some interesting evidence and conclusions to support this assertion.
- Compiling Scheme to IL with Sencha: Again, personal research area. I need to write up some findings based on my compiler work, what I consider to be the unique features of Sencha, and mostly just capture knowledge so I have something to refer back to. I’m thinking more seriously about another compiler effort, and have chatted with the GHC and MSR folks a bit about it. Basically, I intend to create an STG textual format and a corresponding STG-to-IL compiler, based on Simon’s Spineless Tagless G-Machine paper. I believe this could then be used easily as a backend to the GHC compiler. It’s a very ambitious project with a limited user audience, but it seems to have a lot of interesting facets to it. Such software could be used for parallel research on the CLR in the future.
Me
Joe  is an architect and developer on a systems incubation project at Microsoft.
Disclaimer:
The contents of this site are my own personal opinions and do
not represent my employer's view in any way.
© 2013, Joe Duffy