RSS 2.0

Personal Info:

Joe Send mail to the author(s) is a lead architect on an OS incubation project at Microsoft, and was the architect for Parallel Extensions to .NET. He is an author and frequent speaker.

Blogroll:
Other
News
 C|Net
 Kuro5hin
 The Register
Technology
 <?xmlhack?>
 Daily WTF
 DevX
 Hacknot
 Java Today
 Microsoft Top 10 Downloads
 MSDN
 MSDN: "Longhorn"
 MSDN: XML Developer Center
 Slashdot
 Techdirt
 theserverside.com
 W3C
 Web Pages That Suck
 XML Cover Pages
 XML Journal
 xml.com
Technology Blogs
 Aaron Skonnard [PluralSight]
 Adam Bosworth [Google]
 Andy Rich [MS/C++]
 Arpan Desai [MS/XML]
 BCL Team [MS]
 Bill Clementson [Lisp]
 Bill de hÓra
 Bruce Eckel [J]
 Bruce Tate [J]
 Casey Chestnut
 Cedric Beust [Google]
 Chris Anderson [MS/Avalon]
 Chris Lyon [MS]
 Christian Weyer
 Clemens Vasters [newtelligence]
 Craig Andera [PluralSight]
 Dan Sugalski [Parrot]
 Daniel Cazzulino
 Dave Chappel
 Dave Roberts [Lisp]
 Dave Thomas [PragProg]
 Dave Winer
 Dion Almaer [J]
 Don Demsak
 Doug Purdy [MS/Indigo]
 Drew Marsh
 Eric Gunnerson [MS]
 Eric Rudder [MS]
 Eric Sink
 Fritz Onion [PluaralSight]
 Gavin King [J/Hibernate]
 Grady Booch [IBM]
 Hervey Wilson [MS/Indigo]
 Hillel Cooperman [MS/Shell]
 Howard Lewis Ship [J/Apache]
 Ingo Rammer [PluralSight]
 James Gosling [J/Sun]
 James Strachan [J/Groovy]
 Jason Matusow [MS/OSS]
 Jeffrey Schlimmer [MS/Indigo]
 Joe Beda [Google]
 Joel Spoelsky
 Jon Udell
 Josh Ledgard [MS/Evang]
 Joshua Allen [MS]
 Lambda
 Larry Osterman [MS]
 Maoni Stephens [MS/CLR]
 Mark Fussell [MS/XML]
 Martin Fowler
 Martin Gudgin [MS/Indigo]
 Me
 Michael Howard [MS]
 Miguel de Icaza [Mono]
 Mike Clark
 Omri Gazitt [MS/Indigo]
 Pat Helland [MS/PAG]
 Pinku Surana
 Raymond Chen [MS]
 Rich Lander [MS/CLR]
 Rob Howard
 Rob Relyea [MS/Avalon]
 Robert Cringely
 S. Somasegar [MS/DevDiv]
 Sam Gentile
 Scoble [MS/Evang]
 Scott Guthrie [MS/WebNet]
 Scott Hanselman
 Sean McGrath [J]
 Simon Fell
 Stanley Lippman [MS/C++]
 Steve Maine
 Steve Swartz [MS/Indigo]
 Steve Vinoski
 Steven Clarke [MS/Usability]
 Stuart Halloway
 Ted Leung
 Ted Neward [DM]
 Tim Bray [Sun]
 Tim Ewald [Mindreef]
 Tim O'Reilly
 Werner Vogels [Amazon]
 Wintellect
 Yasser Shohoud [MS/Indigo]
Top 20
 Brad Abrams [MS/CLR]
 Chris Brumme [MS/CLR]
 Chris Sells [MS/Ultra]
 Cyrus Najmabadi [MS/C#]
 Dominic Cooney [MS/XAF]
 Don Box [MS/Ultra]
 Don Syme [MS/R]
 Guido van Rossum [Python]
 Herb Sutter [MS/C++]
 Ian Griffiths
 Jason Zander [MS/CLR]
 Jim Hugunin [MS/CLR]
 Joel Pobar [MS/CLR]
 Krzysztof Cwalina [MS/CLR]
 Patrick Logan
 Paul Graham
 Rico Mariani [MS/CLR]
 Rory Blyth [MS/DN]
 Sam Ruby
 Wesner Moise
VC/Business Blogs
 Ed Sim
 Fred Wilson
 Jonathan Schwartz [J/Sun]
 Lawrence Lessig [Stanford]
 Mark Cuban
 Michael Hyatt
 Pierre Omidyar
 Ross Mayfield
 VentureBlog
 Weekly Read
Wine, Food & Tea
 The Silk Road of Wine
 Vinography: a wine blog
 Wine Whys

Disclaimer:
The content of this site are my own personal opinions and do not represent my employer's view in anyway.

© 2010, Joe Duffy

 
 Tuesday, June 16, 2009

One of my many focuses lately has been developing a memory ordering model for our project here at Microsoft.  There are four main questions to answer when defining such a model:

  1. What are the ordering guarantees for ordinary loads and stores?
  2. What are the ordering guarantees for volatile loads and stores?
  3. What kinds of explicit fences are allowed?
  4. Where are fences used automatically, e.g. to preserve type safety and security?

These tend to be the differentiation points for any model.  Everything else is mostly commodity.  Not that there is much else, mind you, but respecting data dependence, not speculating ahead such that exceptions occur that wouldn't have occurred in a sequential execution, and so forth are all must haves, for instance.  Most interesting permutations of answers for these questions have already been explored, and industry consensus is being reached, so it would be better to say I've been picking a model rather than defining one.

What's interesting is that memory model designers are often colored by their favorite architecture du jour.  If somebody cares primarily about X86, they are apt to choose something very strong.  If somebody cares primarily about ARM, however, they are apt to choose something very weak.  There is a classic tradeoff here.  Stronger means easier to program, while weaker means better performance.  For some reason, many of the projects I've worked on have had an abundance of strong hardware (like X86) and a scarcity of weak hardware (like ARM and IA64).  The reality sinks in: most developers on the team code to X86, and then when it comes time to getting more serious about the other platforms, code starts breaking all over the place.  This is why the CLR went so strong in 2.0, even though IA64 was an important platform to support.

Let's look at some common answers to the above questions.

For #1:

  • C++, Visual C++, ECMA 1.0, Java Memory Model, and Prism: no ordering guarantees.
  • CLR 2.0: ordered stores, no ordering for loads.

For #2:

  • C++: prevents compiler-only code motion, but explicit fences are needed for processor ordering.
  • Visual C++, ECMA 1.0, and CLR 2.0: loads are acquire, stores are release ordered.
  • Java Memory Model: loads and stores are fully ordered (sequentially consistent).

For #3:

  • C++: implementation-specific.
  • Visual C++: intrinsics and Win32 APIs.
  • ECMA 1.0 and CLR 2.0: locks, and mostly Win32-style interlocked APIs.
  • Java Memory Model: locks, compare-and-swap, atomics, etc.

For #4:

Managed environments like the CLR and JVM need to ensure type safety, even if ordinary loads and stores are unordered.  This is nontrivial, because the boundary around type safety is blurred.  Certainly we must ensure garbage v-table pointers are not seen.  But is a thread allowed to read non-zeroed memory behind an object reference?  And can it contain garbage (e.g. "values out of thin air")?  What about writes done by mutator threads, including write barriers, while a concurrent collector is tracing objects in the heap?  Are array lengths part of the set of protected fields that mustn't be read out of order?  Strings, since they are commonly used for security checking?  And so on.

It is mainly the deep questions around #4, and also some simple compatibility struggles (around things like double checked locking), that caused the stronger answers for #1 in the CLR 2.0.

In any case, I'm advocating a very different approach than the traditional models.

We pick completely weak ordering for ordinary loads and stores, to enable efficient execution on weaker platforms like ARM, PowerPC, IA64, etc.  That part isn't new.  But here's the clincher.  No volatiles.  There are special variables that are used to communicate between threads (call them volatile if you'd like), but using them implies no kind of special automatic fencing.  Instead, whenever accessing such a variable, at the site of usage, the kind of fence desired must be used (compiler-enforced): full-fence (sequentially consistent), acquire-fence, release-fence, no-fence, or compiler-only-fence (for things like ensuring loads don't get hoisted as loop invariant).  Of course, certain kinds of fences are sprinkled throughout the system to guarantee type safety in all of the aforementioned places (and more), but these are implementation details.

(This approach is rather like Herb Sutter's Prism and C++0x atomics.  See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2664.htm.)

Particularly after managing teams who developed a plethora of lock free code, I love this approach.  I can review code and immediately understand what ordering invariants the developer assumed when writing the code.  This doesn't really make writing lock free code any simpler, except that it forces you to pause and think about things a bit more carefully than you may have otherwise.  But it certainly makes code easier to understand and maintain, and makes it clear to people that sprinkling volatile all over the place isn't going to save your butt: the only thing that will do that is careful thinking and engineering.

6/16/2009 11:53:26 PM (Pacific Daylight Time, UTC-07:00)  #    Comments [7]

 

Recent Entries:

Search:

Browse by Date:
<June 2009>
SunMonTueWedThuFriSat
31123456
78910111213
14151617181920
21222324252627
2829301234
567891011

Browse by Category:

Notables:

Currently Up To:

Reading...

Listening...

Watching...