Personal Info:

Joe is a lead architect on an OS incubation project at Microsoft, and was the architect for Parallel Extensions to .NET. He is an author and frequent speaker.

Disclaimer:
The contents of this site are my own personal opinions and do not represent my employer's view in any way.

© 2010, Joe Duffy

 
 Monday, January 29, 2007

I previously mentioned the X86 JIT contains a "hack" to ensure that thread aborts can't sneak in between a Monitor.Enter(o) and the subsequent try-block.  This ensures that a lock won't be leaked due to a thread abort occurring in the middle of a lock(o) { S1; } block.  In the following example, that means an abort can't be triggered at S0:

Monitor.Enter(o);
S0;
try {
    S1;
} finally {
    Monitor.Exit(o);
}

If an abort could happen at S0, it'd be possible for a thread to acquire lock o, but before entering the try block, be asynchronously aborted, and then not run the finally block to release the lock on o.  This would lead to an orphaned lock, and probable deadlocks later on during execution.  Debugging an instance of such a deadlock would of course be rather difficult because it depends on a very subtle race condition that must occur within the tiny window of a single instruction.  On a single-processor machine, this would require a precariously placed context switch, but as more and more cores are added to the machines that this software runs on, the probability simply increases.

Characterizing this as a "hack" was a little harsh.  It's really just a byproduct of the way that the X86 JIT generates code.

For an asynchronous thread abort to be thrown in a thread, that thread must be either: (1) polling for the abort in the EE or (2) running inside of managed code.  And even if the thread is in managed code, we may not be able to abort it, as is the case if the thread is currently executing a finally block, inside a constrained execution region, etc.  The C# code generation for the lock statement ensures there are no IL instructions between the CALL to Monitor.Enter and the instruction marked as the start of the try block.  The JIT correspondingly will not insert any machine instructions in between the two.  And since any attempted thread aborts in Monitor.Enter are not polled for after the lock has been acquired and before returning, the soonest subsequent point at which an abort can happen is the first instruction following the call to Monitor.Enter.  And at that point, the IP will already be inside the try block (the return from Monitor.Enter returns to the CALL+1), thereby ensuring that the finally block will always run if the lock was acquired.

This might seem like an implementation detail, but the reality is that we can never change it.  Too many people depend on this guarantee.

It turns out that Whidbey's X64 JIT does not guarantee this behavior.  (I suspect IA64 doesn't either, but don't know for sure.)  In fact there's a high probability that this won't work: there is always a NOP instruction between the CALL and the instruction marking the start of the try block in the JITted code.  This is done to make it easier to identify try/catch scopes during stack unwind.  This means that, yes indeed, an abort can happen at S0 on 64-bit.

This will likely be fixed for the next runtime release, but I can't say for sure.

Update 4/17/08: This was indeed fixed for the X64 JIT in Visual Studio 2008.  Note that when compiling C# code targeting both X86 and X64, if you do not use the /o+ switch, this problem can still occur due to extra explicit NOPs inserted before the try.

The framework implements a method Monitor.ReliableEnter, by the way, that could be used to avoid orphaning locks in the face of thread aborts, but it's internal to mscorlib.dll.  It sets an out parameter within a region of code that cannot be interrupted by a thread abort, which the caller can then check inside the finally block.  The acquisition then gets moved inside the try block so that, if the CALL is reached, the finally block is guaranteed to always run.  You'd then write this instead:

bool taken = false;
try {
    Monitor.ReliableEnter(o, out taken);
    S1;
} finally {
    if (taken)
        Monitor.Exit(o);
}
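
ReliableEnter itself is internal, but the idea of setting a flag inside a region that an abort cannot interrupt can be approximated in user code, because (as noted earlier) an abort is not delivered while a thread is executing a finally block. The following is only a rough sketch under that assumption, for the .NET Framework runtimes discussed here. AbortSafeLock, Program, Gate, and Increment are hypothetical names, and this is not how the framework's internal method is implemented.

```csharp
using System;
using System.Threading;

// Hypothetical helper; NOT the framework's internal Monitor.ReliableEnter.
static class AbortSafeLock
{
    // Acquires the monitor on o and sets taken, both inside a finally block,
    // so the acquisition and the flag update cannot be separated by an
    // asynchronous thread abort.
    public static void Enter(object o, out bool taken)
    {
        taken = false;
        try { }
        finally
        {
            Monitor.Enter(o);
            taken = true; // written straight into the caller's variable
        }
    }
}

static class Program
{
    static readonly object Gate = new object();
    static int counter;

    public static int Increment()
    {
        bool taken = false;
        try
        {
            // The call is already inside the try block, so the finally
            // below runs whether or not the lock was acquired.
            AbortSafeLock.Enter(Gate, out taken);
            return ++counter;
        }
        finally
        {
            if (taken)
                Monitor.Exit(Gate);
        }
    }

    static void Main()
    {
        Console.WriteLine(Increment());
    }
}
```

The trade-off is that a thread waiting for a contended lock inside the finally block most likely cannot be aborted until it has acquired the lock, so this sketch trades abort responsiveness for not orphaning the lock.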

It's also possible the CLR team would expose this API in the future.  We wanted to in Whidbey, but didn't have enough time.  If 64-bit code generation were changed so that it doesn't emit a NOP before the try block, however, we probably wouldn't need ReliableEnter after all.
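
For readers on later runtimes: .NET Framework 4 added a public overload, Monitor.Enter(object, ref bool lockTaken), that has the same shape as the pattern described above, and I believe C# 4 and later compilers expand the lock statement in terms of it. A small sketch of calling it directly follows; Account, Deposit, and Demo are hypothetical names used only for illustration.

```csharp
using System;
using System.Threading;

// Hypothetical example type.
sealed class Account
{
    private readonly object gate = new object();
    private decimal balance;

    public decimal Deposit(decimal amount)
    {
        bool lockTaken = false; // must be false when passed to Monitor.Enter
        try
        {
            // The acquisition happens inside the try block; lockTaken is
            // set to true only if the lock was actually acquired.
            Monitor.Enter(gate, ref lockTaken);
            balance += amount;
            return balance;
        }
        finally
        {
            if (lockTaken)
                Monitor.Exit(gate);
        }
    }
}

static class Demo
{
    static void Main()
    {
        Account account = new Account();
        Console.WriteLine(account.Deposit(10m));
    }
}
```

Because the flag is checked in the finally block, it no longer matters whether the JIT places anything between the call and the start of the try block.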

1/30/2007 11:34:24 AM (Pacific Standard Time, UTC-08:00)
You got me worried Joe.

Does this only hold while acquiring locks or does this hold for every try block? In that case this would mean the C# 'using' statement isn't safe on 64-bit versions of the framework, because the creation of the resource used with that statement is placed before the try block.

Do you advise us to stop writing C#'s using statement and start writing the following?

ResourceType resource = null;
try
{
    resource = expression;
    statement;
}
finally
{
    if (resource != null) ((IDisposable)resource).Dispose();
}

And what must we do with our compiled DLLs that are stuffed with those using statements?
1/30/2007 3:32:45 PM (Pacific Standard Time, UTC-08:00)
Yes, Steven, this holds for every try block. We don't make the same guarantees for 'using' that we do with 'lock' for one major reason: resources allocated and disposed of with 'using' must also have finalizers. So even if a thread abort causes the call to Dispose to be skipped, you should be confident that a subsequent finalizer will clean things up. CLR monitors of course don't have the same handy cleanup mechanism.

The code you wrote is still prone to leakage. A thread abort can happen anywhere in between 'expression' and the assignment to 'resource'. Say 'expression' were 'new ResourceType()'; it's possible a completely new ResourceType was initialized, but not yet assigned to 'resource', when a thread abort occurs. In this case ResourceType had better contain a finalizer.
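
The finalizer fallback described in this reply might look roughly like the sketch below. ResourceType is the placeholder name from the comment above; the handle field, the IsReleased property, the Demo class, and the use of Marshal.AllocHGlobal as a stand-in unmanaged resource are all assumptions made for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

// Placeholder type from the discussion; the body is an illustrative assumption.
sealed class ResourceType : IDisposable
{
    // Stand-in for some unmanaged resource.
    private IntPtr handle = Marshal.AllocHGlobal(64);

    public bool IsReleased
    {
        get { return handle == IntPtr.Zero; }
    }

    // Normal path: reached at the end of a 'using' block.
    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this); // the finalizer is no longer needed
    }

    // Fallback path: runs if Dispose was skipped, for example because a
    // thread abort prevented the finally block from being entered.
    ~ResourceType()
    {
        Release();
    }

    private void Release()
    {
        if (handle != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(handle);
            handle = IntPtr.Zero;
        }
    }
}

static class Demo
{
    static void Main()
    {
        using (ResourceType resource = new ResourceType())
        {
            Console.WriteLine(resource.IsReleased);
        }
    }
}
```

In practice, wrapping the raw handle in a SafeHandle-derived type is the usual way to get this fallback without writing the finalizer by hand.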

--joe
1/31/2007 5:13:07 AM (Pacific Standard Time, UTC-08:00)
Joe,

My world is falling apart. I was always under the misconception that using and try-finally blocks in managed code were completely safe, but you're telling me that leakage can happen anyway. Besides, if I'm not mistaken, the CLR doesn't guarantee that finalizers will be called. This, of course, makes things even worse.

Isn't there any way the JIT can compile code that is guaranteed to be free from leakage?
1/31/2007 8:51:33 PM (Pacific Standard Time, UTC-08:00)
Steven, you can use a CER to increase the chances of running cleanup code, but speaking in terms of absolutes: it is ALWAYS possible that your cleanup code won't execute. All of these tools -- finally blocks, finalizers, CERs -- are meant to statistically increase the reliability of your app. But a fallback plan is always required. Your programs should be able to cope with state corruption that persists due to app crashes, rude shutdowns, and so on. This can be difficult in practice, but if you design it into your code to begin with, it's usually possible... even if it means asking the user to reboot the machine or manually repair some corrupted data (though clearly these are last resorts). Note that native code has the same category of problems, though some native apps have more control over things like shutdown and exceptions and so can make stronger guarantees.

Finally blocks + finalizers is enough for most managed programs. Statistically speaking, of course. :)

--joe
2/13/2007 3:11:48 PM (Pacific Standard Time, UTC-08:00)
It gets worse ...

Guess what the following snippet produces on x64


using System;
using System.Threading;

class Program
{
    static void Main(string[] args)
    {
        Thread t = new Thread(new ThreadStart(Test));
        t.Start();
        Console.WriteLine("Abort");
        t.Abort();
        Console.WriteLine("Join");
        t.Join();
        Console.WriteLine("Done");
        Console.ReadKey();
    }

    static void Test()
    {
        while (true)
        {
            try
            {
                Thread.Sleep(10);
            }
            catch (Exception e)
            {
                // This handler also catches ThreadAbortException.
                Console.WriteLine(e.ToString());
            }
        }
    }
}
sambo99
8/4/2007 10:20:09 AM (Pacific Daylight Time, UTC-07:00)

It is quite a bit worse than you've shown in this blog post, Joe.

Look at my recent post for some other locations in the lock statement where a ThreadAbortException shouldn't occur.

http://tdanecker.blogspot.com/2007/08/do-never-ever-use-threadabort.html
9/22/2009 8:56:06 PM (Pacific Daylight Time, UTC-07:00)
Why does the lock keyword expand to a statement where Monitor.Enter is outside the try block? If Monitor.Enter were inside the try block, the finally block could have released the lock and avoided the dangling lock should a thread abort happen.