Personal Info:

Joe Duffy is a lead architect on an OS incubation project at Microsoft, and was the architect for Parallel Extensions to .NET. He is an author and frequent speaker.

Disclaimer:
The contents of this site are my own personal opinions and do not represent my employer's views in any way.

© 2010, Joe Duffy

 
 Wednesday, August 22, 2007

Most managed code in the .NET Framework has not been hardened against asynchronous exceptions.  This includes out of memory (OOM) conditions and asynchronous thread aborts, and is entirely by design.  Hardening against OOM, for example, is historically an extraordinarily difficult feat, and few systems undertake the development and QA costs needed to do so.  (FWIW, the CLR VM is one such system.)  Simply failing gracefully is usually hard enough.  Failing gracefully is admittedly leaps and bounds easier in managed code because allocation failures are communicated via exceptions rather than return values, and are thus transitively propagated “by default.”  Thread aborts are even more difficult to harden against, however, because they can originate at any instruction (with a handful of exceptions).  Ensuring data invariants are protected for every single instruction is clearly just a little difficult.

These things are certainly not impossible.  With enough effort, you can make inroads toward solutions for both issues.  Portions of the .NET Framework have gone to such lengths.  For example, code that manipulates process-wide state spanning AppDomains needs to ensure that this state is not corrupted by an unfortunately placed thread abort when run inside systems like SQL Server that use aborts to tear down boundaries of isolation.  While such hardening is possible, the important thing to understand here is that most of the .NET Framework is in fact not resilient to these things.  See this doc as an example of guidance the CLR team provided to other developers inside of Microsoft to this effect.  OOMs are in a similar category, though many subsystems take different, inconsistent approaches to memory allocation failures (e.g. WPF takes a different stance than WCF).

All of this is a long-winded buildup to the following problem:  thread interrupts are just about as evil as these sorts of asynchronous exceptions.  The failure injection points are more constrained:  an OOM can occur wherever an allocation occurs and a thread abort can happen in between nearly any two instructions, whereas a thread interruption can only occur at a blocking call that transitions the managed thread into the WaitSleepJoin state.  But this doesn’t change the fact that most code is unprepared to deal properly with such interruptions.  Once again, it’s not that managed code cannot be constructed to be resilient to interruptions (in fact, doing so is much easier than for OOMs and thread aborts); it’s simply that the .NET Framework hasn’t been constructed to tolerate arbitrary interruptions.  If threads are calling into these APIs and thread interruptions are provoked, state corruption, memory leaks, and possible deadlocks can be left in the wake.
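To make the "only at blocking calls" point concrete, here is a minimal sketch rather than anything taken from the Framework itself.  The names InterruptDemo and InterruptBlockedThread are made up for illustration.  The worker blocks on an event that is never signaled, and the interruption surfaces as a ThreadInterruptedException thrown out of WaitOne.

```csharp
using System;
using System.Threading;

static class InterruptDemo
{
    // Hypothetical helper: returns true if the worker saw a
    // ThreadInterruptedException while blocked in WaitOne.
    public static bool InterruptBlockedThread()
    {
        bool interrupted = false;
        using (var never = new ManualResetEvent(false)) // never signaled
        {
            var worker = new Thread(() =>
            {
                try
                {
                    never.WaitOne(); // blocking call: thread enters WaitSleepJoin
                }
                catch (ThreadInterruptedException)
                {
                    interrupted = true; // the interrupt is delivered here
                }
            });
            worker.Start();

            // Spin until the worker is actually blocked in the wait.
            while ((worker.ThreadState & ThreadState.WaitSleepJoin) == 0)
            {
                Thread.Sleep(1);
            }

            worker.Interrupt();
            worker.Join(); // after Join, the worker's write is visible
        }
        return interrupted;
    }

    static void Main()
    {
        Console.WriteLine(InterruptBlockedThread());
    }
}
```

The catch block here sits right next to the wait, which is exactly what library code usually does not have: the exception normally escapes from a WaitOne buried several frames down in code you don't own.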

To take a brief example of where such a problem might crop up, imagine a thread has blocked on FileStream.EndRead because it is finishing some asynchronous IO operation.  After a brief inspection of the code, I’m convinced interrupting the call it makes to WaitHandle.WaitOne internally will lead to a memory leak:

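    // Guard: only the first call to EndRead gets past this check.
    if (1 == Interlocked.CompareExchange(ref result._EndXxxCalled, 1, 0))
    {
        __Error.EndReadCalledTwice();
    }
    WaitHandle handle = result._waitHandle;
    if (handle != null)
    {
        try
        {
            // Blocking point: a Thread.Interrupt surfaces here.
            handle.WaitOne();
        }
        finally
        {
            handle.Close();
        }
    }
    // Not inside a finally block: skipped entirely if WaitOne throws.
    NativeOverlapped* nativeOverlappedPtr = result._overlapped;
    if (nativeOverlappedPtr != null)
    {
        Overlapped.Free(nativeOverlappedPtr);
    }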

The method ensures only one call to EndRead can occur, and will throw on subsequent attempts.  So the above code will only ever run once, and there is no second chance to clean up.  Sadly, EndRead needs to free the NativeOverlapped structure used internally for asynchronous IO completion.  But because the call to Overlapped.Free follows the call to WaitOne, and doesn’t occur inside of a finally block, it won’t execute if WaitOne throws a ThreadInterruptedException.  In summary: interrupt that call to WaitOne, and boom, we leak a NativeOverlapped object.  Whether or not this is disastrous of course depends on the precise scenario.  A few bytes here and a few bytes there can quickly add up, particularly for long running programs.  At least this particular example protects invariants sufficiently well to avoid state corruption that would lead to further unpredictability.  But recall that this is just one example.  In my experience, the BCL represents some of the most carefully written code in the .NET Framework, so this problem is undoubtedly scattered about all over the place.
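The ordering problem can be reproduced in isolation.  The sketch below does not touch FileStream.  FakeNativeBlock, EndReadSketch, LeakyEnd, SaferEnd, and RunInterrupted are all hypothetical names, and the "native" block is just a counter.  LeakyEnd mirrors the shape of the excerpt, while SaferEnd moves the release into a finally block.  Note the assumption baked into SaferEnd: it only shows the cleanup-ordering pattern.  For a real NativeOverlapped, freeing the structure while the IO may still be pending would be unsafe, so an actual fix would be more involved than this.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a natively allocated structure;
// it only counts how many instances have not been freed.
sealed class FakeNativeBlock
{
    public static int Live;
    public FakeNativeBlock() { Interlocked.Increment(ref Live); }
    public void Free() { Interlocked.Decrement(ref Live); }
}

static class EndReadSketch
{
    // Same shape as the excerpt: the release follows the wait
    // and sits outside any finally block.
    public static void LeakyEnd(WaitHandle handle, FakeNativeBlock block)
    {
        try { handle.WaitOne(); }
        finally { handle.Close(); }
        block.Free(); // skipped if WaitOne throws
    }

    // The release runs even if the wait is interrupted.
    public static void SaferEnd(WaitHandle handle, FakeNativeBlock block)
    {
        try
        {
            try { handle.WaitOne(); }
            finally { handle.Close(); }
        }
        finally
        {
            block.Free();
        }
    }

    // Runs 'end' on a new thread, interrupts it while it is blocked,
    // and returns how many fake blocks are still live afterwards.
    public static int RunInterrupted(Action<WaitHandle, FakeNativeBlock> end)
    {
        FakeNativeBlock.Live = 0;
        var handle = new ManualResetEvent(false); // never signaled
        var block = new FakeNativeBlock();
        var worker = new Thread(() =>
        {
            try { end(handle, block); }
            catch (ThreadInterruptedException) { /* swallowed for the demo */ }
        });
        worker.Start();
        while ((worker.ThreadState & ThreadState.WaitSleepJoin) == 0)
        {
            Thread.Sleep(1); // wait until the worker is blocked in WaitOne
        }
        worker.Interrupt();
        worker.Join();
        return FakeNativeBlock.Live;
    }

    static void Main()
    {
        Console.WriteLine(RunInterrupted(LeakyEnd)); // one block left behind
        Console.WriteLine(RunInterrupted(SaferEnd)); // nothing left behind
    }
}
```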

Unfortunately, it’s become somewhat common advice that using thread interruption as a synchronization and control mechanism is a GoodThing™.  Andrew Birrell, a researcher from Microsoft Research, for example, suggested this in his paper “An Introduction to Programming with C# Threads”:

“Interrupts are most useful when you don’t know exactly what is going on. For example, the target thread might be blocked in any of several packages, or within a single package it might be waiting on any of several objects. In these cases an interrupt is certainly the best solution. Even when other alternatives are available, it might be best to use interrupts just because they are a single unified scheme for provoking thread termination.” (p33)

While I am sure this advice is well intentioned, it is extremely dangerous for the subtle reasons outlined above and can lead to reliability problems in any programs that follow it.  My recommendation is to build this kind of higher level synchronization into the code that you actually own, and handle shutdown and interruption logic yourself.  This is a bit cumbersome and is more work, but it also ensures that arbitrary blocking points in the libraries you use will not be affected by interruptions.
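As a rough illustration of what "handle shutdown yourself" can look like, here is a minimal sketch of a worker that owns its one blocking point and waits on a shutdown event alongside its work signal.  The Worker class and its Post and Stop members are hypothetical names, and incrementing a counter stands in for real work.  Nothing here calls Thread.Interrupt, so waits buried inside library code are never disturbed.

```csharp
using System;
using System.Threading;

// Hypothetical worker that implements its own shutdown protocol.
sealed class Worker
{
    private readonly Semaphore _work = new Semaphore(0, int.MaxValue);
    private readonly ManualResetEvent _shutdown = new ManualResetEvent(false);
    private readonly Thread _thread;
    private int _processed;

    public Worker()
    {
        _thread = new Thread(Run);
        _thread.Start();
    }

    // Safe to read once Stop has returned (Join orders the accesses).
    public int Processed { get { return _processed; } }

    // Signals that one unit of work is available.
    public void Post() { _work.Release(); }

    // Requests shutdown and waits for the worker to exit on its own terms.
    public void Stop()
    {
        _shutdown.Set();
        _thread.Join();
    }

    private void Run()
    {
        // Work is listed first, so pending work is drained before a
        // shutdown request is honored when both are signaled.
        var handles = new WaitHandle[] { _work, _shutdown };
        while (true)
        {
            // The only blocking point, and this code owns it.
            int signaled = WaitHandle.WaitAny(handles);
            if (signaled == 1)
            {
                return; // shutdown requested: exit at a known-safe point
            }
            _processed++; // stand-in for real work
        }
    }
}

static class WorkerDemo
{
    // Posts 'items' units of work, stops the worker, and reports
    // how many units it processed.
    public static int RunAndStop(int items)
    {
        var worker = new Worker();
        for (int i = 0; i < items; i++)
        {
            worker.Post();
        }
        worker.Stop();
        return worker.Processed;
    }

    static void Main()
    {
        Console.WriteLine(RunAndStop(3));
    }
}
```

The cost is that every blocking point you own has to know about the shutdown event.  The benefit is that the thread only ever stops at places you chose.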

With the increase in hardware parallelism over the coming years, I worry that the use of interruptions will become more widespread as a popular technique developers use to control threads.  And as more and more of the .NET Framework uses higher degrees of concurrency, necessarily requiring more internal synchronization, the number of blocking points that are vulnerable to this kind of abuse will grow accordingly.  So, please, do your part… avoid Thread.Interrupt like the plague.  In fact, perhaps we should deprecate it.

8/22/2007 11:52:33 PM (Pacific Daylight Time, UTC-07:00)

 
