<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:pingback="http://madskills.com/public/xml/rss/module/pingback/" version="2.0">
  <channel>
    <title>Generalities &amp; Details: Adventures in the High-tech Underbelly</title>
    <link>http://www.bluebytesoftware.com/blog/</link>
    <description>Joe Duffy's Weblog</description>
    <language>en-us</language>
    <copyright>Joe Duffy</copyright>
    <lastBuildDate>Fri, 12 Apr 2013 18:06:33 GMT</lastBuildDate>
    <generator>newtelligence dasBlog 1.8.5223.2</generator>
    <managingEditor>joe@bluebytesoftware.com</managingEditor>
    <webMaster>joe@bluebytesoftware.com</webMaster>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=4ac787c5-df36-45bd-a2f1-4911dffa535d</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,4ac787c5-df36-45bd-a2f1-4911dffa535d.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
      I am naturally drawn to teams that work at an insane pace. The momentum, and persistent
      drive to increase that momentum, generates amazing results. And it's crazy fun.
   </p>
        <p>
      In such environments, however, I've found one thing to be a constant struggle for
      everybody on the team -- leaders, managers, and individual doers alike: remembering
      to take the necessary time to do the right thing. This sounds obvious, but it's very
      easy to lose sight of this important principle when deadlines loom, customers and
      managers and shareholders demand, and the overall team is running ahead at a breakneak
      pace.
   </p>
        <p>
      A nice phrase I learned from a past manager of mine was, "sometimes you need to slow
      down to speed up."
   </p>
        <p>
      By taking shortcuts today, though attractive in that they help meet that next closest
      deadline, you almost always pay for them down the road. You might subsequently become
      quagmired in bugs because quality was comprimised from the outset. You may create
      a platform that others build upon, only to realize later that the architecture is
      wrong in need of revamping, incurring a ripple effect on an entire software stack.
      You may realize that your whole system performs poorly under load, such that just
      when your startup was beginning to skyrocket to success, users instead flee due to
      the poor experience. The manifestation differs, but the root cause is the same.
   </p>
        <p>
      The level of quality you need for a project is very specific to your technology and
      business. I'll admit that working on systems software demands different quality standards
      than web software, for example. And the quality demands change as a project matures,
      when the focus shifts from writing reams of new code to modifying existing code...
      although the early phases are in fact the most challenging: this is when the most
      critical cultural traits are not yet set but are developing, when things have the
      highest risk of getting set off in the wrong direction, and is when you are most likely
      to scrimp on quality due to the need to make rapid progress on a broad set of problems
      all at once.
   </p>
        <p>
      So how do you ensure people end up doing the right thing? Well, I'd be lying if I
      didn't say it is a real challenge.
   </p>
        <p>
      As a leader, it is important to create a culture where individuals get rewarded for
      doing the right thing. Nothing beats having a team full of folks that "self-police"
      themselves using a shared set of demanding principles.
   </p>
        <p>
      To achieve this, leaders needs to be consistent, demanding, and hyper-aware of what's
      going on around them. You need to be able to recognize quality versus junk, so that
      you can reward the right people. You need to set up a culture where critical feedback
      when shortcuts are being taken is "okay" and "expected." I've made my beliefs pretty
      evident in prior articles, however I simply don't believe you can do this right in
      the early days without being highly technical yourself. As a team grows, your attention
      to technical detail may get stretched thin, in which case you need to scale by adding
      new technical leaders that share, recognize, and maintain or advance these cultural
      traits.
   </p>
        <p>
      You also can't punish people for getting less done than they could have if they took
      those shortcuts. Many cultures reward those who hammer out large quantities of poorly
      written code. You get what you reward.
   </p>
        <p>
      In fact, you must do the opposite, by making an example out of the people who check
      in crappy code.
   </p>
        <p>
      Facebook has this slogan "move fast and break things." It may seem that what I'm saying
      above is at odds with that famous slogan. Indeed they are somewhat contradictory,
      however paradoxically they are also highly complementary. Although you need to slow
      down to do the right thing, you do also need to keep moving fast. If that seems impossible,
      it's not; but it sure is difficult to find the right balance.
   </p>
        <p>
      I have a belief that I'm almost embarassed to admit: I believe that most people are
      incredibly lazy. I think most quality comprimise stems from an inherent laziness that
      leads to details being glossed over, even if they are consciously recognized as needing
      attention. The best developers maintain this almost supernatural drive that comes
      from somewhere deep within, and they use this drive to stave off the laziness. If
      you're moving fast and writing a lot of code, strive to utilize every ounce of intellectual
      horsepower you can muster -- sustained, for the entire time you are writing code.
      Even if that's for 16 hours straight. If at any moment a thought occurs that might
      save you time down the road, stop, ponder it, course correct on the fly. This is a
      way of "slowing down to speed up" but in a way where you can still be moving fast.
      Many lazier people let these fleeting thoughts go without exploring them fully. They
      will consciously do the wrong thing because doing the right thing takes more time.
   </p>
        <p>
      I've developed odd habits over the years. As a compile runs, I literally pore over
      every modified line of code, wondering if there's a better way to do it. If I see
      something, I push it on the stack and make sure to come back to it. By the time I've
      actually commited some new code -- regardless of whether it's 10,000 lines of freshly
      written code, or a 10 line modification to some existing stuff -- chances are that
      I've read each line of code at least three times. I disallow any detail I see to slip
      through the cracks. And my mind obsesses over all aspects of my work, even during
      "off times" (e.g., eating dinner, walking down the hallway, etc). Each of these opportunities
      represents a chance to slow down, reflect, and course correct.
   </p>
        <p>
      Do I still miss thing? Sure I do. But that's why it's so critical to have a team around
      you who shares the same principles and will help to identify any shortcomings that
      I've missed.
   </p>
        <p>
      Another practice I encourage on my team is fixing broken windows. I'm sure folks are
      aware of the so-called <a href="http://en.wikipedia.org/wiki/Fixing_Broken_Windows">broken
      windows theory</a>, where neighborhoods in which broken windows are tolerated tend
      to accumulate more and more broken windows with time. It happens in code, too. If
      people are discouraged from stopping to fix the broken windows, you will end up with
      lots of them. And guess what, each broken window actually slows you down. As more
      and more accumulate, it can become a real chore to get anything meaningful done. I
      guarantee you will not be able to move very fast if too many broken windows pile up
      and start needing attention. Slowing down to fix them incrementally, as soon as they
      are noticed, speeds you up down the road.
   </p>
        <p>
      Building a quality-focused team isn't easy. But creating a culture that slows down
      to do the right thing, while simultaneously moving fast, provides an enormous competetive
      advantage. It's not as common as you might think.
   </p>
        <img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=4ac787c5-df36-45bd-a2f1-4911dffa535d" />
      </body>
      <title>Software Leadership #4: Slow Down to Speed Up</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,4ac787c5-df36-45bd-a2f1-4911dffa535d.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2013/04/12/SoftwareLeadership4SlowDownToSpeedUp.aspx</link>
      <pubDate>Fri, 12 Apr 2013 18:06:33 GMT</pubDate>
      <description>&lt;p&gt;
   I am naturally drawn to teams that work at an insane pace. The momentum, and persistent
   drive to increase that momentum, generates amazing results. And it's crazy fun.
&lt;/p&gt;
&lt;p&gt;
   In such environments, however, I've found one thing to be a constant struggle for
   everybody on the team -- leaders, managers, and individual doers alike: remembering
   to take the necessary time to do the right thing. This sounds obvious, but it's very
   easy to lose sight of this important principle when deadlines loom, customers and
   managers and shareholders demand, and the overall team is running ahead at a breakneak
   pace.
&lt;/p&gt;
&lt;p&gt;
   A nice phrase I learned from a past manager of mine was, "sometimes you need to slow
   down to speed up."
&lt;/p&gt;
&lt;p&gt;
   By taking shortcuts today, though attractive in that they help meet that next closest
   deadline, you almost always pay for them down the road. You might subsequently become
   quagmired in bugs because quality was comprimised from the outset. You may create
   a platform that others build upon, only to realize later that the architecture is
   wrong in need of revamping, incurring a ripple effect on an entire software stack.
   You may realize that your whole system performs poorly under load, such that just
   when your startup was beginning to skyrocket to success, users instead flee due to
   the poor experience. The manifestation differs, but the root cause is the same.
&lt;/p&gt;
&lt;p&gt;
   The level of quality you need for a project is very specific to your technology and
   business. I'll admit that working on systems software demands different quality standards
   than web software, for example. And the quality demands change as a project matures,
   when the focus shifts from writing reams of new code to modifying existing code...
   although the early phases are in fact the most challenging: this is when the most
   critical cultural traits are not yet set but are developing, when things have the
   highest risk of getting set off in the wrong direction, and is when you are most likely
   to scrimp on quality due to the need to make rapid progress on a broad set of problems
   all at once.
&lt;/p&gt;
&lt;p&gt;
   So how do you ensure people end up doing the right thing? Well, I'd be lying if I
   didn't say it is a real challenge.
&lt;/p&gt;
&lt;p&gt;
   As a leader, it is important to create a culture where individuals get rewarded for
   doing the right thing. Nothing beats having a team full of folks that "self-police"
   themselves using a shared set of demanding principles.
&lt;/p&gt;
&lt;p&gt;
   To achieve this, leaders needs to be consistent, demanding, and hyper-aware of what's
   going on around them. You need to be able to recognize quality versus junk, so that
   you can reward the right people. You need to set up a culture where critical feedback
   when shortcuts are being taken is "okay" and "expected." I've made my beliefs pretty
   evident in prior articles, however I simply don't believe you can do this right in
   the early days without being highly technical yourself. As a team grows, your attention
   to technical detail may get stretched thin, in which case you need to scale by adding
   new technical leaders that share, recognize, and maintain or advance these cultural
   traits.
&lt;/p&gt;
&lt;p&gt;
   You also can't punish people for getting less done than they could have if they took
   those shortcuts. Many cultures reward those who hammer out large quantities of poorly
   written code. You get what you reward.
&lt;/p&gt;
&lt;p&gt;
   In fact, you must do the opposite, by making an example out of the people who check
   in crappy code.
&lt;/p&gt;
&lt;p&gt;
   Facebook has this slogan "move fast and break things." It may seem that what I'm saying
   above is at odds with that famous slogan. Indeed they are somewhat contradictory,
   however paradoxically they are also highly complementary. Although you need to slow
   down to do the right thing, you do also need to keep moving fast. If that seems impossible,
   it's not; but it sure is difficult to find the right balance.
&lt;/p&gt;
&lt;p&gt;
   I have a belief that I'm almost embarassed to admit: I believe that most people are
   incredibly lazy. I think most quality comprimise stems from an inherent laziness that
   leads to details being glossed over, even if they are consciously recognized as needing
   attention. The best developers maintain this almost supernatural drive that comes
   from somewhere deep within, and they use this drive to stave off the laziness. If
   you're moving fast and writing a lot of code, strive to utilize every ounce of intellectual
   horsepower you can muster -- sustained, for the entire time you are writing code.
   Even if that's for 16 hours straight. If at any moment a thought occurs that might
   save you time down the road, stop, ponder it, course correct on the fly. This is a
   way of "slowing down to speed up" but in a way where you can still be moving fast.
   Many lazier people let these fleeting thoughts go without exploring them fully. They
   will consciously do the wrong thing because doing the right thing takes more time.
&lt;/p&gt;
&lt;p&gt;
   I've developed odd habits over the years. As a compile runs, I literally pore over
   every modified line of code, wondering if there's a better way to do it. If I see
   something, I push it on the stack and make sure to come back to it. By the time I've
   actually commited some new code -- regardless of whether it's 10,000 lines of freshly
   written code, or a 10 line modification to some existing stuff -- chances are that
   I've read each line of code at least three times. I disallow any detail I see to slip
   through the cracks. And my mind obsesses over all aspects of my work, even during
   "off times" (e.g., eating dinner, walking down the hallway, etc). Each of these opportunities
   represents a chance to slow down, reflect, and course correct.
&lt;/p&gt;
&lt;p&gt;
   Do I still miss thing? Sure I do. But that's why it's so critical to have a team around
   you who shares the same principles and will help to identify any shortcomings that
   I've missed.
&lt;/p&gt;
&lt;p&gt;
   Another practice I encourage on my team is fixing broken windows. I'm sure folks are
   aware of the so-called &lt;a href="http://en.wikipedia.org/wiki/Fixing_Broken_Windows"&gt;broken
   windows theory&lt;/a&gt;, where neighborhoods in which broken windows are tolerated tend
   to accumulate more and more broken windows with time. It happens in code, too. If
   people are discouraged from stopping to fix the broken windows, you will end up with
   lots of them. And guess what, each broken window actually slows you down. As more
   and more accumulate, it can become a real chore to get anything meaningful done. I
   guarantee you will not be able to move very fast if too many broken windows pile up
   and start needing attention. Slowing down to fix them incrementally, as soon as they
   are noticed, speeds you up down the road.
&lt;/p&gt;
&lt;p&gt;
   Building a quality-focused team isn't easy. But creating a culture that slows down
   to do the right thing, while simultaneously moving fast, provides an enormous competetive
   advantage. It's not as common as you might think.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=4ac787c5-df36-45bd-a2f1-4911dffa535d" /&gt;</description>
      <category>Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=76aac194-b4cf-4bb3-aac6-26f1ab6c2b2e</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,76aac194-b4cf-4bb3-aac6-26f1ab6c2b2e.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
      I mentioned a few months back that my team had collaborated with MSR to publish a
      paper to OOPSLA about some novel aspects of our programming language (see <a href="http://www.bluebytesoftware.com/blog/2012/10/28/UniquenessAndReferenceImmutabilityForSafeParallelism.aspx">here</a> and <a href="http://www.bluebytesoftware.com/blog/2012/12/08/ImperativeFunctional.aspx">here</a>).
   </p>
        <p>
      I was excited when Jonathan over at InfoQ asked to interview me about this work. We
      had a fun back and forth, and I hope the result helps to clarify some of the design
      goals and decisions we made along the way.
   </p>
        <p>
      You can check it out here: <b><a href="http://www.infoq.com/articles/Joe-Duffy-Safe-Parallelism">Uniqueness
      and Reference Immutability for Safe Parallelism</a></b>.
   </p>
        <img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=76aac194-b4cf-4bb3-aac6-26f1ab6c2b2e" />
      </body>
      <title>InfoQ interview about safe concurrency</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,76aac194-b4cf-4bb3-aac6-26f1ab6c2b2e.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2013/04/11/InfoQInterviewAboutSafeConcurrency.aspx</link>
      <pubDate>Thu, 11 Apr 2013 18:53:29 GMT</pubDate>
      <description>&lt;p&gt;
   I mentioned a few months back that my team had collaborated with MSR to publish a
   paper to OOPSLA about some novel aspects of our programming language (see &lt;a href="http://www.bluebytesoftware.com/blog/2012/10/28/UniquenessAndReferenceImmutabilityForSafeParallelism.aspx"&gt;here&lt;/a&gt; and &lt;a href="http://www.bluebytesoftware.com/blog/2012/12/08/ImperativeFunctional.aspx"&gt;here&lt;/a&gt;).&gt;
&lt;/p&gt;
&lt;p&gt;
   I was excited when Jonathan over at InfoQ asked to interview me about this work. We
   had a fun back and forth, and I hope the result helps to clarify some of the design
   goals and decisions we made along the way.
&lt;/p&gt;
&lt;p&gt;
   You can check it out here: &lt;b&gt;&lt;a href="http://www.infoq.com/articles/Joe-Duffy-Safe-Parallelism"&gt;Uniqueness
   and Reference Immutability for Safe Parallelism&lt;/a&gt;&lt;/b&gt;.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=76aac194-b4cf-4bb3-aac6-26f1ab6c2b2e" /&gt;</description>
      <category>Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=891db0f8-4bbd-44fc-bdba-6bb8a44a7a8e</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,891db0f8-4bbd-44fc-bdba-6bb8a44a7a8e.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
      It's really hard to build a great team. It can take years of hard work and an enormous
      amount of patience.
   </p>
        <p>
      The reality is that there's only a finite (read: small) number of truly amazing software
      developers in the world, especially compared to the opportunities and exciting projects
      available to them.
   </p>
        <p>
      And yet, great teams are fueled first and foremost by great people. I often liken
      this to the aphorism "a rising tide lifts all boats."
   </p>
        <p>
      The original meaning of the phrase of course had nothing to do with software. It was
      the notion that focusing on growth of the overall economy's GDP will necessarily have
      a positive impact on the incomes of individuals within that economy. Now, of course,
      it's not always true, and I'm no theoretical economist, however the basic idea in
      spirit is an intuitively interesting one.
   </p>
        <p>
      Applying this thinking to teams, it implies you should always strive to hire better
      and better people. That by doing so, the overall quality of the team will rise. Hiring
      better and better people has a nonlinear impact to the culture, because a team is
      not just a disjoint set of nodes, but is instead a fully connected graph of individuals
      who have conversations and collaborate together. A greater overall quality of the
      team means richer connections and more powerful, higher quality innovation and software.
      It means your chance of truly changing the world has grown nonlinearly as well.
   </p>
        <p>
      I strive to only hire people who are better than me, and better than people already
      on the team, in some interesting dimension. As soon as you let your high standards
      drop even an ounce, the average drops and there is a cumulative snowballing effect.
      The connections grow weaker, and a nonlinear drop in quality and innovation will occur.
      This is my nightmare scenario because it can go downhill very quickly.
   </p>
        <p>
        </p>
        <p>
      This applies to an entire company as well as individual teams. Including what can
      happen should the tides lower. The brain drain begins as a slow drip, and can quickly
      turn into a torrential downpour in an instant. It often starts from the top, because
      culture and hiring start from the top.
   </p>
        <p>
      Now, I will be the first to admit that raising the tide is hard. Damn hard, in fact.
      I have another phrase which is "always be on." That incredible engineer you worked
      with ten years ago just might be the piece missing in the puzzle today, and a good
      way to lift the boats. Opportunities come and go when you least suspect them, and
      you want those people to want to join your team. I have several individuals that literally
      took years of effort to recruit. And the wait was well worth it. This advice applies
      to individual contributers as much as it does to managers. You never know if in a
      few years, you'll be leading a team, kicking off your own startup, or even just helping
      to make your own team a better place.
   </p>
        <p>
      And as a leader you owe all of this to your existing team. By lifting the boats, your
      entire team benefits. They grow, learn new things, and reach new heights in their
      own careers.
   </p>
        <p>
      Despite being hard work, this all pays off the end. There is very little I find more
      satisfying in life than building and growing a great team, seeing the year over year
      improvements, and creating amazing things together. Perhaps even more than coding.
      (gasp)
   </p>
        <img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=891db0f8-4bbd-44fc-bdba-6bb8a44a7a8e" />
      </body>
      <title>Software Leadership #3: A Rising Tide Lifts All Boats</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,891db0f8-4bbd-44fc-bdba-6bb8a44a7a8e.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2013/03/16/SoftwareLeadership3ARisingTideLiftsAllBoats.aspx</link>
      <pubDate>Sat, 16 Mar 2013 19:38:53 GMT</pubDate>
      <description>&lt;p&gt;
   It's really hard to build a great team. It can take years of hard work and an enormous
   amount of patience.
&lt;/p&gt;
&lt;p&gt;
   The reality is that there's only a finite (read: small) number of truly amazing software
   developers in the world, especially compared to the opportunities and exciting projects
   available to them.
&lt;/p&gt;
&lt;p&gt;
   And yet, great teams are fueled first and foremost by great people. I often liken
   this to the aphorism "a rising tide lifts all boats."
&lt;/p&gt;
&lt;p&gt;
   The original meaning of the phrase of course had nothing to do with software. It was
   the notion that focusing on growth of the overall economy's GDP will necessarily have
   a positive impact on the incomes of individuals within that economy. Now, of course,
   it's not always true, and I'm no theoretical economist, however the basic idea in
   spirit is an intuitively interesting one.
&lt;/p&gt;
&lt;p&gt;
   Applying this thinking to teams, it implies you should always strive to hire better
   and better people. That by doing so, the overall quality of the team will rise. Hiring
   better and better people has a nonlinear impact to the culture, because a team is
   not just a disjoint set of nodes, but is instead a fully connected graph of individuals
   who have conversations and collaborate together. A greater overall quality of the
   team means richer connections and more powerful, higher quality innovation and software.
   It means your chance of truly changing the world has grown nonlinearly as well.
&lt;/p&gt;
&lt;p&gt;
   I strive to only hire people who are better than me, and better than people already
   on the team, in some interesting dimension. As soon as you let your high standards
   drop even an ounce, the average drops and there is a cumulative snowballing effect.
   The connections grow weaker, and a nonlinear drop in quality and innovation will occur.
   This is my nightmare scenario because it can go downhill very quickly.&lt;p&gt;
&lt;p&gt;
   This applies to an entire company as well as individual teams. Including what can
   happen should the tides lower. The brain drain begins as a slow drip, and can quickly
   turn into a torrential downpour in an instant. It often starts from the top, because
   culture and hiring start from the top.
&lt;/p&gt;
&lt;p&gt;
   Now, I will be the first to admit that raising the tide is hard. Damn hard, in fact.
   I have another phrase which is "always be on." That incredible engineer you worked
   with ten years ago just might be the piece missing in the puzzle today, and a good
   way to lift the boats. Opportunities come and go when you least suspect them, and
   you want those people to want to join your team. I have several individuals that literally
   took years of effort to recruit. And the wait was well worth it. This advice applies
   to individual contributers as much as it does to managers. You never know if in a
   few years, you'll be leading a team, kicking off your own startup, or even just helping
   to make your own team a better place.
&lt;/p&gt;
&lt;p&gt;
   And as a leader you owe all of this to your existing team. By lifting the boats, your
   entire team benefits. They grow, learn new things, and reach new heights in their
   own careers.
&lt;/p&gt;
&lt;p&gt;
   Despite being hard work, this all pays off the end. There is very little I find more
   satisfying in life than building and growing a great team, seeing the year over year
   improvements, and creating amazing things together. Perhaps even more than coding.
   (gasp)
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=891db0f8-4bbd-44fc-bdba-6bb8a44a7a8e" /&gt;</description>
      <category>Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=60ecff1a-a45c-4d26-b7e7-44e1ffe79f7d</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,60ecff1a-a45c-4d26-b7e7-44e1ffe79f7d.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
      The very notion of "authority" is 90% in your head. And it's one that often holds
      back otherwise very capable people.
   </p>
        <p>
      This is yet another one that I got entirely wrong early in my own career.
   </p>
        <p>
      When starting a new job, it's natural to be in what I call "understanding and assessment"
      mode. In fact, coming into a new job and telling everybody how they are doing things
      wrong is a recipe to not only get you off on the wrong foot, but also permanently
      poison your relationship with what would have been very important allies down the
      road. However, it's critical to turn the corner at some point before it's too late.
      The more experience you have, the less time it should take.
   </p>
        <p>
      When I first came to Microsoft, I was suddenly surrounded by lots of smart people
      with momentum and energy on whatever it was that they were building. This led me to
      initially assume that these people knew what was going on. It led me to assume that,
      simply because some guy has the title of "Corporate Vice President" or "Distinguished
      Engineer", they knew what was going on. In fact, in my first day on the job -- and
      I will never forget this -- I was in a design meeting where I had the "audacity" to
      tell another DE (Microsoft's highest ranking technical position at the time) that
      I disagreed with what he was saying. I was polite about it, and to this day think
      I was right. However, I was pulled aside afterwards and told how stupid a move that
      was. And what's worse, I actually listened for my first two years and went out of
      my way not to rock the boat too much. This was against my better judgment. I was still
      young, and had come from another job where I was confident and could safely question
      anything; but I made the silly mistake of thinking "well, maybe things are different
      around here." I still wish I could get those two years back.
   </p>
        <p>
      Allow me to let you in on a little secret. (Well, okay, it's not really a secret,
      but if only I could go back and tell my younger self this. And I suppose it ought
      to be obvious.) These people don't always know what is going on. It's probably safe
      to assume that these people have been rewarded in their careers because, statistically
      speaking, they are right more often than they are wrong. But it's still just statistics.
      And truthfully, if they are any good, they will like being questioned. They enjoy
      the technical debate. This is a critical aspect of a great team.
   </p>
        <p>
      In fact, if they don't enjoy the debate, you are likely in the wrong place. The person
      who told me I was being stupid was actually right. The organization I had joined punished,
      rather than rewarded, people who questioned people in positions of authority. As soon
      as I realized this, I got out. It's a very personal preference, I suppose, but I personally
      prefer organizations that reward and promote people based on ability and direct impact.
      Sadly, in organizations where authority prevails, advancement is almost always based
      on who-likes-who, ass-kissing, and time-in-position. For folks looking for cozy jobs
      with guaranteed income, perhaps this is ideal; but you're quite unlikely to grow rapidly,
      build amazing things, and change the world in such places.
   </p>
        <p>
      The employees I love the most are those that ask tons of questions and aren't afraid
      to tell me when I'm wrong. These people are inquisitive about everything, whether
      the topic is highly technical or pure business. By questioning my own views, and forcing
      me to articulate them, there is an overall strengthening of the culture. Not only
      do I benefit as a leader by having to methodically think through and defend my approach
      to problem solving, however those around me also benefit because (a) often they end
      up influencing the organization in big ways, and (b) even if my original stance survives,
      they understand the rationale behind certain decisions and can grow as a result. And
      it's fun! -- albeit passionately heated at times.
   </p>
        <p>
      In fact, it's painful for me to see the opposite. An employee who has been trained
      to blindly respect authority. I was an admittedly rebellious youngster, so I've always
      known that this trait is simply not in my nature. I often reflect on how lucky I am
      to have landed in software rather than the military. Even when making those mistakes
      early in my career, I knew in my heart that they were mistakes at the time. But I
      do know that some people feel comfortable with a hierarchy of authority. They like
      the structure, and questioning it simply isn't an option. Sadly, some such people
      are beyond repair.
   </p>
        <p>
      These days, I have to say that kids who grew up programming in their teens are the
      most fearless and rebellious. I have a dirty admission: I love hiring and mentoring
      and growing these individuals the most, because they have yet to be "trained" to respect
      authority. Once a vulnerable person early in their career has been brainwashed, it
      is an incredibly difficult thing to reverse. And so, as their career progresses, the
      longer these habits sit unaddressed, the less salvageable they become. Thankfully
      I caught it early on. I've managed plenty of folks who didn't.
   </p>
        <p>
      Now, you can't be an asshole about it. And you can't be arrogant. Software is all
      about people and collaboration, and all of this questioning must be done with a single
      goal in mind: to make the software, the organization, and/or its people better. Authority
      is there for a reason, which is that ultimately someone needs to run the business,
      make decisions, and have their butt on the line. Sometimes the simple reality is that
      a leader's intuition is extremely good, and though data may be lacking to support
      the decision, you can trust it. It's okay to agree to disagree, or sometimes admit
      that someone simply has a stronger background than you in a particular area and so
      maybe you aren't in a position to fully understand why a certain decision was made.
      I have always tried to turn such occasions into learning opportunities. Jot down a
      few notes, and go read about it afterwards. I always jot down and research any term
      I hear that I'm not totally on top of, technical or business. It happens all the time.
   </p>
        <p>
      In your next job interview, go out of your way to question a thing or two. If the
      person on the other end acts offended, either you asked in the wrong way (remember:
      respectful but inquisitive), or you shouldn't take the job. If it's a startup, read
      the business plan ahead of time and come prepared with some hard questions. If it's
      a corporation, ask what is rewarded, find holes in the technical architecture, question
      areas of the engineering process that could be improved. I really do think this is
      one of the most important cultural traits of a well-run team. And I guarantee you'll
      have way more <b>fun</b> on such a team, perhaps the most important cultural trait
      of all.
   </p>
        <img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=60ecff1a-a45c-4d26-b7e7-44e1ffe79f7d" />
      </body>
      <title>Software Leadership #2: Authority is an Illusion</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,60ecff1a-a45c-4d26-b7e7-44e1ffe79f7d.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2013/03/03/SoftwareLeadership2AuthorityIsAnIllusion.aspx</link>
      <pubDate>Sun, 03 Mar 2013 17:52:29 GMT</pubDate>
      <description>&lt;p&gt;
   The very notion of "authority" is 90% in your head. And it's one that often holds
   back otherwise very capable people.
&lt;/p&gt;
&lt;p&gt;
   This is yet another one that I got entirely wrong early in my own career.
&lt;/p&gt;
&lt;p&gt;
   When starting a new job, it's natural to be in what I call "understanding and assessment"
   mode. In fact, coming into a new job and telling everybody how they are doing things
   wrong is a recipe to not only get you off on the wrong foot, but also permanently
   poison your relationship with what would have been very important allies down the
   road. However, it's critical to turn the corner at some point before it's too late.
   The more experience you have, the less time it should take.
&lt;/p&gt;
&lt;p&gt;
   When I first came to Microsoft, I was suddenly surrounded by lots of smart people
   with momentum and energy on whatever it was that they were building. This led me to
   initially assume that these people knew what was going on. It led me to assume that,
   simply because some guy has the title of "Corporate Vice President" or "Distinguished
   Engineer", they knew what was going on. In fact, in my first day on the job -- and
   I will never forget this -- I was in a design meeting where I had the "audacity" to
   tell another DE (Microsoft's highest ranking technical position at the time) that
   I disagreed with what he was saying. I was polite about it, and to this day think
   I was right. However, I was pulled aside afterwards and told how stupid a move that
   was. And what's worse, I actually listened for my first two years and went out of
   my way not to rock the boat too much. This was against my better judgment. I was still
   young, and had come from another job where I was confident and could safely question
   anything; but I made the silly mistake of thinking "well, maybe things are different
   around here." I still wish I could get those two years back.
&lt;/p&gt;
&lt;p&gt;
   Allow me to let you in on a little secret. (Well, okay, it's not really a secret,
   but if only I could go back and tell my younger self this. And I suppose it ought
   to be obvious.) These people don't always know what is going on. It's probably safe
   to assume that these people have been rewarded in their careers because, statistically
   speaking, they are right more often than they are wrong. But it's still just statistics.
   And truthfully, if they are any good, they will like being questioned. They enjoy
   the technical debate. This is a critical aspect of a great team.
&lt;/p&gt;
&lt;p&gt;
   In fact, if they don't enjoy the debate, you are likely in the wrong place. The person
   who told me I was being stupid was actually right. The organization I had joined punished,
   rather than rewarded, people who questioned people in positions of authority. As soon
   as I realized this, I got out. It's a very personal preference, I suppose, but I personally
   prefer organizations that reward and promote people based on ability and direct impact.
   Sadly, in organizations where authority prevails, advancement is almost always based
   on who-likes-who, ass-kissing, and time-in-position. For folks looking for cozy jobs
   with guaranteed income, perhaps this is ideal; but you're quite unlikely to grow rapidly,
   build amazing things, and change the world in such places.
&lt;/p&gt;
&lt;p&gt;
   The employees I love the most are those that ask tons of questions and aren't afraid
   to tell me when I'm wrong. These people are inquisitive about everything, whether
   the topic is highly technical or pure business. By questioning my own views, and forcing
   me to articulate them, there is an overall strengthening of the culture. Not only
   do I benefit as a leader by having to methodically think through and defend my approach
   to problem solving, however those around me also benefit because (a) often they end
   up influencing the organization in big ways, and (b) even if my original stance survives,
   they understand the rationale behind certain decisions and can grow as a result. And
   it's fun! -- albeit passionately heated at times.
&lt;/p&gt;
&lt;p&gt;
   In fact, it's painful for me to see the opposite. An employee who has been trained
   to blindly respect authority. I was an admittedly rebellious youngster, so I've always
   known that this trait is simply not in my nature. I often reflect on how lucky I am
   to have landed in software rather than the military. Even when making those mistakes
   early in my career, I knew in my heart that they were mistakes at the time. But I
   do know that some people feel comfortable with a hierarchy of authority. They like
   the structure, and questioning it simply isn't an option. Sadly, some such people
   are beyond repair.
&lt;/p&gt;
&lt;p&gt;
   These days, I have to say that kids who grew up programming in their teens are the
   most fearless and rebellious. I have a dirty admission: I love hiring and mentoring
   and growing these individuals the most, because they have yet to be "trained" to respect
   authority. Once a vulnerable person early in their career has been brainwashed, it
   is an incredibly difficult thing to reverse. And so, as their career progresses, the
   longer these habits sit unaddressed, the less salvageable they become. Thankfully
   I caught it early on. I've managed plenty of folks who didn't.
&lt;/p&gt;
&lt;p&gt;
   Now, you can't be an asshole about it. And you can't be arrogant. Software is all
   about people and collaboration, and all of this questioning must be done with a single
   goal in mind: to make the software, the organization, and/or its people better. Authority
   is there for a reason, which is that ultimately someone needs to run the business,
   make decisions, and have their butt on the line. Sometimes the simple reality is that
   a leader's intuition is extremely good, and though data may be lacking to support
   the decision, you can trust it. It's okay to agree to disagree, or sometimes admit
   that someone simply has a stronger background than you in a particular area and so
   maybe you aren't in a position to fully understand why a certain decision was made.
   I have always tried to turn such occasions into learning opportunities. Jot down a
   few notes, and go read about it afterwards. I always jot down and research any term
   I hear that I'm not totally on top of, technical or business. It happens all the time.
&lt;/p&gt;
&lt;p&gt;
   In your next job interview, go out of your way to question a thing or two. If the
   person on the other end acts offended, either you asked in the wrong way (remember:
   respectful but inquisitive), or you shouldn't take the job. If it's a startup, read
   the business plan ahead of time and come prepared with some hard questions. If it's
   a corporation, ask what is rewarded, find holes in the technical architecture, question
   areas of the engineering process that could be improved. I really do think this is
   one of the most important cultural traits of a well-run team. And I guarantee you'll
   have way more &lt;b&gt;fun&lt;/b&gt; on such a team, perhaps the most important cultural trait
   of all.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=60ecff1a-a45c-4d26-b7e7-44e1ffe79f7d" /&gt;</description>
      <category>Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=b7eadb39-c7e5-4392-8657-8247a50da220</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,b7eadb39-c7e5-4392-8657-8247a50da220.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
      The best people in software have an innate ability to communicate using code. They
      have an idea and simply code it up, thereby making it reality. In fact, the best people
      are, I would say, obsessed with code.
   </p>
        <p>
      Pick somebody in software that has done great things. Bill Gates comes to mind for
      me, because that’s who inspired me to get started in software. He wrote code for as
      long as he could manage, and famously delivered code reviews even as his company grew
      to 1,000s of engineers. No matter who you pick, I am sure one thing rings true: they
      obsessed over the details. And when it comes to software, those details are in the
      code.
   </p>
        <p>
      Those who cannot read and write code must spend all of their time convincing other
      people of their ideas, and are usually sufficiently disconnected from reality (i.e.,
      the code) that their ideas do not work in practice. This is an awful situation to
      be in, particularly at a company whose primary asset is code. Worse, most people voluntarily
      place themselves into this category, particularly over time in their careers, because
      they believe that coding is "not one of their job responsibilities." What rubbish!
   </p>
        <p>
      I have three particular pet peeve examples to give.
   </p>
        <p>
      The first is what I like to call the "mediocre mid-level manager syndrome." I’ll admit
      that when you manage large enough teams, you have to give up on a bit of coding. I
      will personally never give it up entirely, even as I manage teams of 1,000s of engineers.
      I will always use the product my team is building, I will read the checkins to at
      least understand what’s going on and stay grounded, and, assuming I continue to manage
      groups building development platforms, I will write code using that platform. But
      for managers who assume responsibility for 10 or fewer engineers, there is absolutely
      no excuse for slacking in these areas. It’s just pure laziness, and the teams suffer
      enormously: such teams typically lack "adult supervision" in the area of engineering
      culture, lack a role model, and build wrong and crappy things. In short, such managers
      literally add negative value. I can’t tell you how many "Software Development Leads"
      at large corporations fall into this category. That this is often culturally accepted
      is totally broken; needless to say, it is not acceptable on my teams.
   </p>
        <p>
      The second is something I call "code is beneath me." The two most prominent examples
      are folks late in their careers and researchers. The former often goes hand in hand
      with the mid-level manager problem. But I’ve seen it afflict software engineers too:
      "I’ve been a professional developer for 10 years, so my job is now to tell others
      what to do rather than doing anything myself." At this point, they might adopt the
      title Architect. The research issue, however, quite frankly perplexes me. Computer
      Science is an odd mixture of pure math and applied engineering. I get that many CS
      researchers are more math-oriented, and wish to basically do mathematics rather than
      software. I also get that much of this research bears fruit. But in my experience
      there is a very large contingency of researchers that do not produce "first rate mathematics,"
      and yet resist becoming grounded in code. The idea that you can improve the state
      of software, whose bloodline is code, without ever writing a line of it or becoming
      proficient in it, is complete insanity. And yet it’s generally accepted.
   </p>
        <p>
      The last example, which is close to home since I made this mistake myself for a couple
      years early in my career, is "I manage things, I don’t build them." The title of Program
      Manager is a specific manifestation of this problem. Most have backgrounds in CS,
      and have probably written a little code. But most PMs are also usually not very good
      at it, don’t love it, and probably haven’t written much since leaving school. And
      yet these people are often "in charge" of making decisions about features, prioritization,
      and competitive offerings. It’s true that some people have great intuition and can
      make some good decisions without knowing how things work. But when it comes to software,
      those abilities need to be grounded by the code. If that’s not interesting to them,
      I always encourage considering positions in sales and marketing, HR, or one of the
      other organizations in such companies that isn’t focused on actually building the
      software. I would literally abolish the PM position at my company if I could. Those
      who love code should become developers.
   </p>
        <p>
      I have the utmost respect for people who have fallen into any of these traps, but
      then realizes it and gets out. Hey, I did so myself.
   </p>
        <p>
      Even those that write code often don’t do it enough. I’ve seen so many fall into the
      trap of debating whether or not something would work, or how elegant it would be.
      Certain people are afraid of failure, or find it difficult to get motivated to "start"
      coding. The best people, however, realize that questions are easily answered by writing
      the code in prototype form. They go from 0-60 in an instant, having a vision of what
      they would like to build, and letting nothing get in the way. I call this "oozing
      code from your fingertips." I do think some of this is a skillset thing. In software,
      the top 20% are easily 50X more productive than the bottom 20%. But I also think these
      traits can be learned, given role models who exhibit and demonstrate the behavior.
   </p>
        <p>
      Finally, I do encourage software leaders to read as much code as they can. Reading
      code is a great way to learn how things work, and to stay on top of what’s actually
      happening in your project. And it keeps your mind fresh, and often leads to new ideas.
   </p>
        <p>
      If it isn’t obvious, I might have a slightly atypical bias here. But it’s one of the
      things I am most passionate about with respect to running software teams. Code speaks.
      Love the code.
   </p>
        <img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=b7eadb39-c7e5-4392-8657-8247a50da220" />
      </body>
      <title>Software Leadership #1: Code Speaks; Love the Code</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,b7eadb39-c7e5-4392-8657-8247a50da220.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2013/02/17/SoftwareLeadership1CodeSpeaksLoveTheCode.aspx</link>
      <pubDate>Sun, 17 Feb 2013 16:38:29 GMT</pubDate>
      <description>&lt;p&gt;
   The best people in software have an innate ability to communicate using code. They
   have an idea and simply code it up, thereby making it reality. In fact, the best people
   are, I would say, obsessed with code.
&lt;/p&gt;
&lt;p&gt;
   Pick somebody in software that has done great things. Bill Gates comes to mind for
   me, because that’s who inspired me to get started in software. He wrote code for as
   long as he could manage, and famously delivered code reviews even as his company grew
   to 1,000s of engineers. No matter who you pick, I am sure one thing rings true: they
   obsessed over the details. And when it comes to software, those details are in the
   code.
&lt;/p&gt;
&lt;p&gt;
   Those who cannot read and write code must spend all of their time convincing other
   people of their ideas, and are usually sufficiently disconnected from reality (i.e.,
   the code) that their ideas do not work in practice. This is an awful situation to
   be in, particularly at a company whose primary asset is code. Worse, most people voluntarily
   place themselves into this category, particularly over time in their careers, because
   they believe that coding is "not one of their job responsibilities." What rubbish!
&lt;/p&gt;
&lt;p&gt;
   I have three particular pet peeve examples to give.
&lt;/p&gt;
&lt;p&gt;
   The first is what I like to call the "mediocre mid-level manager syndrome." I’ll admit
   that when you manage large enough teams, you have to give up on a bit of coding. I
   will personally never give it up entirely, even as I manage teams of 1,000s of engineers.
   I will always use the product my team is building, I will read the checkins to at
   least understand what’s going on and stay grounded, and, assuming I continue to manage
   groups building development platforms, I will write code using that platform. But
   for managers who assume responsibility for 10 or fewer engineers, there is absolutely
   no excuse for slacking in these areas. It’s just pure laziness, and the teams suffer
   enormously: such teams typically lack "adult supervision" in the area of engineering
   culture, lack a role model, and build wrong and crappy things. In short, such managers
   literally add negative value. I can’t tell you how many "Software Development Leads"
   at large corporations fall into this category. That this is often culturally accepted
   is totally broken; needless to say, it is not acceptable on my teams.
&lt;/p&gt;
&lt;p&gt;
   The second is something I call "code is beneath me." The two most prominent examples
   are folks late in their careers and researchers. The former often goes hand in hand
   with the mid-level manager problem. But I’ve seen it afflict software engineers too:
   "I’ve been a professional developer for 10 years, so my job is now to tell others
   what to do rather than doing anything myself." At this point, they might adopt the
   title Architect. The research issue, however, quite frankly perplexes me. Computer
   Science is an odd mixture of pure math and applied engineering. I get that many CS
   researchers are more math-oriented, and wish to basically do mathematics rather than
   software. I also get that much of this research bears fruit. But in my experience
   there is a very large contingency of researchers that do not produce "first rate mathematics,"
   and yet resist becoming grounded in code. The idea that you can improve the state
   of software, whose bloodline is code, without ever writing a line of it or becoming
   proficient in it, is complete insanity. And yet it’s generally accepted.
&lt;/p&gt;
&lt;p&gt;
   The last example, which is close to home since I made this mistake myself for a couple
   years early in my career, is "I manage things, I don’t build them." The title of Program
   Manager is a specific manifestation of this problem. Most have backgrounds in CS,
   and have probably written a little code. But most PMs are also usually not very good
   at it, don’t love it, and probably haven’t written much since leaving school. And
   yet these people are often "in charge" of making decisions about features, prioritization,
   and competitive offerings. It’s true that some people have great intuition and can
   make some good decisions without knowing how things work. But when it comes to software,
   those abilities need to be grounded by the code. If that’s not interesting to them,
   I always encourage considering positions in sales and marketing, HR, or one of the
   other organizations in such companies that isn’t focused on actually building the
   software. I would literally abolish the PM position at my company if I could. Those
   who love code should become developers.
&lt;/p&gt;
&lt;p&gt;
   I have the utmost respect for people who have fallen into any of these traps, but
   then realizes it and gets out. Hey, I did so myself.
&lt;/p&gt;
&lt;p&gt;
   Even those that write code often don’t do it enough. I’ve seen so many fall into the
   trap of debating whether or not something would work, or how elegant it would be.
   Certain people are afraid of failure, or find it difficult to get motivated to "start"
   coding. The best people, however, realize that questions are easily answered by writing
   the code in prototype form. They go from 0-60 in an instant, having a vision of what
   they would like to build, and letting nothing get in the way. I call this "oozing
   code from your fingertips." I do think some of this is a skillset thing. In software,
   the top 20% are easily 50X more productive than the bottom 20%. But I also think these
   traits can be learned, given role models who exhibit and demonstrate the behavior.
&lt;/p&gt;
&lt;p&gt;
   Finally, I do encourage software leaders to read as much code as they can. Reading
   code is a great way to learn how things work, and to stay on top of what’s actually
   happening in your project. And it keeps your mind fresh, and often leads to new ideas.
&lt;/p&gt;
&lt;p&gt;
   If it isn’t obvious, I might have a slightly atypical bias here. But it’s one of the
   things I am most passionate about with respect to running software teams. Code speaks.
   Love the code.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=b7eadb39-c7e5-4392-8657-8247a50da220" /&gt;</description>
      <category>Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=3e6acc99-210b-4f46-92a4-d297334acafe</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,3e6acc99-210b-4f46-92a4-d297334acafe.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
      I’ve been managing software teams for several years. Perhaps more importantly, I have
      worked for some excellent leaders and have had the opportunity to learn from their
      good (and bad) habits.
   </p>
        <p>
      Because I haven’t written a line of .NET code in a few years now, that blogging well
      has kind of run dry. And sadly my team is not yet ready to openly share our platform
      externally, so I cannot blog about that either.
   </p>
        <p>
      As a result, I thought it would be fun to start a series about leadership in software.
      Not just the kind of leadership expected of managers, but also individual developers
      and architects. I have no idea how frequently I’ll write something, however just having
      a continuum of content to contribute to when I have a spare moment will help liven
      this place back up again, I’m sure. Furthermore, one lesson that’s been imparted upon
      me over the years is that "writing is thinking"; so by writing this stuff down, I’m
      sure it will crystalize even further.
   </p>
        <p>
      The series will be called "Software Leadership" because, after all, it’s about the
      software. I hope you enjoy.
   </p>
        <img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=3e6acc99-210b-4f46-92a4-d297334acafe" />
      </body>
      <title>Software Leadership series</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,3e6acc99-210b-4f46-92a4-d297334acafe.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2013/02/17/SoftwareLeadershipSeries.aspx</link>
      <pubDate>Sun, 17 Feb 2013 16:35:30 GMT</pubDate>
      <description>&lt;p&gt;
   I’ve been managing software teams for several years. Perhaps more importantly, I have
   worked for some excellent leaders and have had the opportunity to learn from their
   good (and bad) habits.
&lt;/p&gt;
&lt;p&gt;
   Because I haven’t written a line of .NET code in a few years now, that blogging well
   has kind of run dry. And sadly my team is not yet ready to openly share our platform
   externally, so I cannot blog about that either.
&lt;/p&gt;
&lt;p&gt;
   As a result, I thought it would be fun to start a series about leadership in software.
   Not just the kind of leadership expected of managers, but also individual developers
   and architects. I have no idea how frequently I’ll write something, however just having
   a continuum of content to contribute to when I have a spare moment will help liven
   this place back up again, I’m sure. Furthermore, one lesson that’s been imparted upon
   me over the years is that "writing is thinking"; so by writing this stuff down, I’m
   sure it will crystalize even further.
&lt;/p&gt;
&lt;p&gt;
   The series will be called "Software Leadership" because, after all, it’s about the
   software. I hope you enjoy.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=3e6acc99-210b-4f46-92a4-d297334acafe" /&gt;</description>
      <category>Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=92e759dd-f735-4e9d-9379-5b7caebd5132</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,92e759dd-f735-4e9d-9379-5b7caebd5132.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
      I <a href="http://wwwbluebytesoftware.com/blog/2012/10/28/UniquenessAndReferenceImmutabilityForSafeParallelism.aspx">mentioned
      recently</a> that a paper from my team appeared at OOPSLA in October:
   </p>
        <blockquote>
          <p>
            <b>Uniqueness and Reference Immutability for Safe Parallelism</b> (<a href="http://dl.acm.org/citation.cfm?id=2384619">ACM</a>, <a href="http://research.microsoft.com/pubs/170528/msr-tr-2012-79.pdf">MSR
      Tech Report</a> [PDF])
   </p>
        </blockquote>
        <p>
      It's refreshing that we were able to release it. Our project only occasionally gets
      a public shout-out, usually when something leaks by accident. But this time it was
      intentional.
   </p>
        <p>
      I began the language work described about 5 years ago, and it's taken several turns
      of the crank to get to a good point. (Hint: several more than even what you see in
      the paper.) Given the novel proof work in collaboration with our intern, folks in
      MSR, and a visiting professor expert in the area, however, it seemed like a good checkpoint
      that would be sufficiently interesting to release to the public. Perhaps some day
      Microsoft's development community will get to try it out in earnest.
   </p>
        <p>
      There seems to have been some confusion over the goals of this work. I wanted to take
      a moment to clear the air.
   </p>
        <p>
      First, despite assertions elsewhere, the primary focus of this work was not "implicit
      parallelism." Instead, I would summarize our goals as:
   </p>
        <ol>
          <li>
         Create a single language that incorporates the best of functional and imperative programming.
         Both offer distinct advantages, and we envisioned marrying the two. 
      </li>
          <li>
         Codify and statically enforce common shared-memory state patterns, such as immutability
         and isolation, with minimal runtime overhead (i.e., virtually none). 
      </li>
          <li>
         Provide a language model atop which can be built provably safe data, task, and actor-oriented
         concurrency abstractions. Both implicit and explicit. This includes, but is not limtied
         to, parallelism. 
      </li>
          <li>
         Do all of this while still offering industry-leading code quality and performance,
         rivaling that of systems-level C programs. Yet still with the safety implied by the
         abovementioned goals. 
      </li>
        </ol>
        <p>
      The language features in the paper are a vast subset of the full suite needed to achieve
      our overall project goals. However, these alone have exceeded our original expectations.
   </p>
        <p>
      I've programmed a great deal in functional languages. I'm a long-time lover of LISP
      and ML, and my closest friends know about my hard-core dedication to Haskell (expressed
      in an admittedly odd manner). In fact, Haskell's elegant marriage of pure functional
      programming with monads, notably the state monad, was a major inspiration for the
      design of the type system. There are of course many other influences, such as regions,
      linear types, affine types, etc.; however, I'd say Haskell was the strongest.
   </p>
        <p>
      In some sense, we have simply taken the reverse angle of Haskell with its monads:
      what would it be like to embed pure functional programming within an otherwise imperative
      language?
   </p>
        <p>
      This first goal is proving to be my fondest aspect of the language. The ability to
      have "pockets of imperative mutability," familiar to programmers with C, C++, C#,
      and Java backgrounds, connected by a "functional tissue," is not only clarifying,
      but works quite well in practice for building large and complex concurrent systems.
      It turns out many systems follow this model. Concurrent Haskell shares this high-level
      architecture, as does Erlang. Well-written C# systems do the same, though the language
      doesn't (yet) help you to get it right.
   </p>
        <p>
      Of course, as called out by the second goal, immutability and controlled side-effects
      are tremendously useful features on their own. Novel optimizations abound.
   </p>
        <p>
      And it helps programmers declare and verify their intent. As mentioned in the paper,
      we have found/prevented many significant bugs this way. Did you ever want to verify
      that your contracts and assertions are pure, such that conditional compilation doesn't
      change the outcome of your program? Or that your sort comparator isn't mutating the
      elements while performing its comparisons? Neither has much to do with concurrency,
      although the latter facilitates parallel sorts. Many other systems introduce specific
      verification techniques to address specific problems, rather than employing a general
      purpose type system.
   </p>
        <p>
      I would say the strength with respect to concurrency is not the type system itself,
      but rather what you can do with it.
   </p>
        <p>
      The focus on implicit parallelism in the recent forum discussions was unfortunate.
      I guess "implicit parallelism" just makes for catchy and controversial titles. Yes,
      the type system makes implicit parallelism "safe and possible," some forms of which
      are indeed profitable, but it's not as though suddenly all of your for loops are going
      to run 8-times faster after a recompile. The optimization angle is an orthogonal,
      but very real, concern. There are <a href="http://www.cs.cmu.edu/~scandal/nesl.html">decades
      of research and experience</a> here.
   </p>
        <p>
      Even when tasks are explicitly spawned, however, the fact that the type system catches
      unsafe mutable state capture that would lead to race conditions is, I dare say, game
      changing. I could never go back to the old model of instruction-level races, which
      now-a-days feels like programming a PDP6 to me (no insults implied). And yes, data
      parallel works great in this model. It may take a bit of imagination, rereading the
      article, and perhaps looking at related work such as <a href="http://dpj.cs.uiuc.edu/DPJ/Home.html">Deterministic
      Parallel Java</a>, to understand how, but it does.
   </p>
        <p>
      The effort grew out of my work on Software Transactional Memory in 2004, then Parallel
      Extensions (TPL and PLINQ), and then <a href="http://www.bluebytesoftware.com/books/winconc/winconc_book_resources.html">my
      book</a>, a few years later. I had grown frustrated that our programming languages
      didn't help us write correct concurrent code. Instead, these systems simply keep offering
      more and more unsafe building blocks and synchronization primitives. Although I admit
      to contributing to the mess, it continues to this day. How many flavors of tasks and
      blocking queues does the world need? I was also dismayed by the oft-cited "functional
      programming cures everything" mantra, which clearly isn't true: most languages, Haskell
      aside, still offer mutability. And few of them track said mutability in a way that
      is visible to the type system (Haskell, again, being the exception). This means that
      races are still omnipresent, and thus concurrent programs expensive and error prone
      to write and maintain.
   </p>
        <p>
      Reflecting back, I am somewhat amazed that the language has taken so long to hatch.
      Type systems that are sound and strike the right balance of utility and approachability
      are hard work!
   </p>
        <p>
      I am ecstatic that we've been able to make inroads towards solving these hard problems.
      My team is, quite simply, an amazing group of people, and without them the ideas would
      have never made it beyond the "that will never work" phase. I look forward to sharing
      more about our work in the years to come.
   </p>
        <img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=92e759dd-f735-4e9d-9379-5b7caebd5132" />
      </body>
      <title>Imperative + Functional == :-)</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,92e759dd-f735-4e9d-9379-5b7caebd5132.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2012/12/08/ImperativeFunctional.aspx</link>
      <pubDate>Sat, 08 Dec 2012 19:42:47 GMT</pubDate>
      <description>&lt;p&gt;
   I &lt;a href="http://wwwbluebytesoftware.com/blog/2012/10/28/UniquenessAndReferenceImmutabilityForSafeParallelism.aspx"&gt;mentioned
   recently&lt;/a&gt; that a paper from my team appeared at OOPSLA in October:
&lt;/p&gt;
&lt;blockquote&gt; 
&lt;p&gt;
   &lt;b&gt;Uniqueness and Reference Immutability for Safe Parallelism&lt;/b&gt; (&lt;a href="http://dl.acm.org/citation.cfm?id=2384619"&gt;ACM&lt;/a&gt;, &lt;a href="http://research.microsoft.com/pubs/170528/msr-tr-2012-79.pdf"&gt;MSR
   Tech Report&lt;/a&gt; [PDF])
&lt;/p&gt;
&lt;/blockquote&gt; 
&lt;p&gt;
   It's refreshing that we were able to release it. Our project only occasionally gets
   a public shout-out, usually when something leaks by accident. But this time it was
   intentional.
&lt;/p&gt;
&lt;p&gt;
   I began the language work described about 5 years ago, and it's taken several turns
   of the crank to get to a good point. (Hint: several more than even what you see in
   the paper.) Given the novel proof work in collaboration with our intern, folks in
   MSR, and a visiting professor expert in the area, however, it seemed like a good checkpoint
   that would be sufficiently interesting to release to the public. Perhaps some day
   Microsoft's development community will get to try it out in earnest.
&lt;/p&gt;
&lt;p&gt;
   There seems to have been some confusion over the goals of this work. I wanted to take
   a moment to clear the air.
&lt;/p&gt;
&lt;p&gt;
   First, despite assertions elsewhere, the primary focus of this work was not "implicit
   parallelism." Instead, I would summarize our goals as:
&lt;/p&gt;
&lt;ol&gt;
   &lt;li&gt;
      Create a single language that incorporates the best of functional and imperative programming.
      Both offer distinct advantages, and we envisioned marrying the two. 
   &lt;li&gt;
      Codify and statically enforce common shared-memory state patterns, such as immutability
      and isolation, with minimal runtime overhead (i.e., virtually none). 
   &lt;li&gt;
      Provide a language model atop which can be built provably safe data, task, and actor-oriented
      concurrency abstractions. Both implicit and explicit. This includes, but is not limtied
      to, parallelism. 
   &lt;li&gt;
      Do all of this while still offering industry-leading code quality and performance,
      rivaling that of systems-level C programs. Yet still with the safety implied by the
      abovementioned goals. 
&lt;/ol&gt;
&lt;p&gt;
   The language features in the paper are a vast subset of the full suite needed to achieve
   our overall project goals. However, these alone have exceeded our original expectations.
&lt;/p&gt;
&lt;p&gt;
   I've programmed a great deal in functional languages. I'm a long-time lover of LISP
   and ML, and my closest friends know about my hard-core dedication to Haskell (expressed
   in an admittedly odd manner). In fact, Haskell's elegant marriage of pure functional
   programming with monads, notably the state monad, was a major inspiration for the
   design of the type system. There are of course many other influences, such as regions,
   linear types, affine types, etc.; however, I'd say Haskell was the strongest.
&lt;/p&gt;
&lt;p&gt;
   In some sense, we have simply taken the reverse angle of Haskell with its monads:
   what would it be like to embed pure functional programming within an otherwise imperative
   language?
&lt;/p&gt;
&lt;p&gt;
   This first goal is proving to be my fondest aspect of the language. The ability to
   have "pockets of imperative mutability," familiar to programmers with C, C++, C#,
   and Java backgrounds, connected by a "functional tissue," is not only clarifying,
   but works quite well in practice for building large and complex concurrent systems.
   It turns out many systems follow this model. Concurrent Haskell shares this high-level
   architecture, as does Erlang. Well-written C# systems do the same, though the language
   doesn't (yet) help you to get it right.
&lt;/p&gt;
&lt;p&gt;
   Of course, as called out by the second goal, immutability and controlled side-effects
   are tremendously useful features on their own. Novel optimizations abound.
&lt;/p&gt;
&lt;p&gt;
   And it helps programmers declare and verify their intent. As mentioned in the paper,
   we have found/prevented many significant bugs this way. Did you ever want to verify
   that your contracts and assertions are pure, such that conditional compilation doesn't
   change the outcome of your program? Or that your sort comparator isn't mutating the
   elements while performing its comparisons? Neither has much to do with concurrency,
   although the latter facilitates parallel sorts. Many other systems introduce specific
   verification techniques to address specific problems, rather than employing a general
   purpose type system.
&lt;/p&gt;
&lt;p&gt;
   I would say the strength with respect to concurrency is not the type system itself,
   but rather what you can do with it.
&lt;/p&gt;
&lt;p&gt;
   The focus on implicit parallelism in the recent forum discussions was unfortunate.
   I guess "implicit parallelism" just makes for catchy and controversial titles. Yes,
   the type system makes implicit parallelism "safe and possible," some forms of which
   are indeed profitable, but it's not as though suddenly all of your for loops are going
   to run 8-times faster after a recompile. The optimization angle is an orthogonal,
   but very real, concern. There are &lt;a href="http://www.cs.cmu.edu/~scandal/nesl.html"&gt;decades
   of research and experience&lt;/a&gt; here.
&lt;/p&gt;
&lt;p&gt;
   Even when tasks are explicitly spawned, however, the fact that the type system catches
   unsafe mutable state capture that would lead to race conditions is, I dare say, game
   changing. I could never go back to the old model of instruction-level races, which
   now-a-days feels like programming a PDP6 to me (no insults implied). And yes, data
   parallel works great in this model. It may take a bit of imagination, rereading the
   article, and perhaps looking at related work such as &lt;a href="http://dpj.cs.uiuc.edu/DPJ/Home.html"&gt;Deterministic
   Parallel Java&lt;/a&gt;, to understand how, but it does.
&lt;/p&gt;
&lt;p&gt;
   The effort grew out of my work on Software Transactional Memory in 2004, then Parallel
   Extensions (TPL and PLINQ), and then &lt;a href="http://www.bluebytesoftware.com/books/winconc/winconc_book_resources.html"&gt;my
   book&lt;/a&gt;, a few years later. I had grown frustrated that our programming languages
   didn't help us write correct concurrent code. Instead, these systems simply keep offering
   more and more unsafe building blocks and synchronization primitives. Although I admit
   to contributing to the mess, it continues to this day. How many flavors of tasks and
   blocking queues does the world need? I was also dismayed by the oft-cited "functional
   programming cures everything" mantra, which clearly isn't true: most languages, Haskell
   aside, still offer mutability. And few of them track said mutability in a way that
   is visible to the type system (Haskell, again, being the exception). This means that
   races are still omnipresent, and thus concurrent programs expensive and error prone
   to write and maintain.
&lt;/p&gt;
&lt;p&gt;
   Reflecting back, I am somewhat amazed that the language has taken so long to hatch.
   Type systems that are sound and strike the right balance of utility and approachability
   are hard work!
&lt;/p&gt;
&lt;p&gt;
   I am ecstatic that we've been able to make inroads towards solving these hard problems.
   My team is, quite simply, an amazing group of people, and without them the ideas would
   have never made it beyond the "that will never work" phase. I look forward to sharing
   more about our work in the years to come.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=92e759dd-f735-4e9d-9379-5b7caebd5132" /&gt;</description>
      <category>Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=ae7274ea-47f3-46e1-857d-f70361544ae4</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,ae7274ea-47f3-46e1-857d-f70361544ae4.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
      .NET holds an enormous advantage over C++.
   </p>
        <p>
      Well, okay, there are a few, but I’m thinking about one in particular: A single string
      type.
   </p>
        <p>
      What’s not to love about that? Anybody who has done more than an hour’s worth of Windows
      programming in C++ should appreciate this feature. No more zero-terminated char* vs.
      length-prefixed char* vs. BSTR vs. wchar_t vs. CStringA vs. CStringW vs. CComBSTR.
      Just System.String. Hurray!
   </p>
        <p>
      There’s one very specific thing not to love, however: The ease with which you can
      allocate a new one.
   </p>
        <p>
      I’ve been working in an environment where performance is critical, and everything
      is managed code, for several years now. That might sound like an oxymoron, but our
      system can in fact beat the pants off all the popular native programming environments.
      The key to success? Thought and discipline.
   </p>
        <p>
      We, in fact, love our single string type. And yet our team has learned (the hard way)
      that string allocations, while seemingly innocuous and small, spell certain death.
   </p>
        <p>
      It may seem strange to pick on string. There are dozens of other objects you might
      allocate, like custom data types, arrays, lists, and whatnot. But there tend to be
      many core infrastructural pieces that deal with string manipulation, and if you build
      atop the wrong abstractions then things are sure to go wrong.
   </p>
        <p>
      Imagine a web stack. It’s all about string parsing and processing. And anything to
      do with distributed processing of data is most likely going to involve strings at
      some level. Etc.
   </p>
        <p>
      There are landmine APIs lurking out there, like String.Split and String.Substring.
      Even if you’ve got an interned string in hand (often rare in a server environment
      where strings are built from dynamically produced data), using these APIs will allocate
      boatloads of tiny little strings. And boatloads of tiny little strings means collections.
   </p>
        <p>
      For example, imagine I just want to perform some action for each substring in a comma-delimited
      string. I could of course write it as follows: 
   </p>
        <p>
        </p>
        <pre>
          <code>string str = ...; string[] substrs = str.Split(','); foreach (string
   subtr in substrs) { Process(substr); }
   </code>
        </pre>
        <p>
      Or I could write it as follows:
   </p>
        <p>
        </p>
        <pre>
          <code>string str = ...; int lastIndex = 0; int commaIndex; while ((commaIndex
   = str.IndexOf(',', commaIndex)) != -1) { Process(substr, lastIndex, commaIndex); lastIndex
   = commaIndex + 1; }
   </code>
        </pre>
        <p>
      The latter certainly requires a bit more thought. That’s primarily because .NET doesn’t
      have an efficient notion of substring – creating one requires an allocation. But the
      performance difference is night and day. The first one allocates an array and individual
      substrings, whereas the second performs no allocations. If this is, say, parsing HTTP
      headers on a heavily loaded server, you bet it’s going to make a noticeable difference.
   </p>
        <p>
      Honestly, I’ve witnessed programs that should be I/O bound turn into programs that
      are compute-bound, simply due to use of inefficient string parsing routines across
      enormous amounts of data. (Okay, the developers also did other sloppy allocation-heavy
      things, but string certainly contributed.) Remember, many managed programs must compete
      with C++, where developers are accustomed to being more thoughtful about allocations
      in the context of parsing. Mainly because it’s such a pain in the ass to managed ad-hoc
      allocation lifetimes, versus in-place or stack-based parsing where it’s trivial.
   </p>
        <p>
      "But gen0 collections are free," you might say. Sure, they are cheaper than gen1 and
      gen2 collections, but they are most certainly not free. Each collection is a linked
      list traversal that executes a nontrivial number of instructions and trashes your
      cache. It’s true that generational collectors minimize the pain, but they do not completely
      eliminate it. This, I think, is one of the biggest fallacies that plagues managed
      code to this day. Developers who treat the GC like their zero-cost scratch pad end
      up creating abstractions that poison the well for everybody.
   </p>
        <p>
      Crank up .NET’s XmlReader and profile loading a modest XML document. You’ll be surprised
      to see that allocations during parsing add up to approximately 4X the document’s size.
      Many of these are strings. How did we end up in such a place? Presumably because whoever
      wrote these abstractions fell trap to the fallacy that "gen0 collections are free."
      But also because layers upon layers of such things lie beneath.
   </p>
        <p>
      It doesn’t have to be this way. String does, after all, have an indexer. And it’s
      type-safe! So in-place parsing at least won’t lead to buffer overruns. Sadly, I have
      concluded that few people, at least in the context of .NET, will write efficient string
      parsing code. The whole platform is written to assume that strings are available,
      and does not have an efficient representation of a transient substring. And of course
      the APIs have been designed to coax you into making copy after copy, rather than doing
      efficient text manipulation in place. Hell, even the HTTP and ASP.NET web stacks are
      rife with such inefficiencies.
   </p>
        <p>
      In certain arenas, doing all of this efficiently actually pays the bills. In others
      arenas, it doesn’t, and I suppose it’s possible to ignore all of this and let the
      GC chew up 30% or more of your program’s execution time without anybody noticing.
      I’m baffled that such software is written, but at the same time I realize that my
      expectations are out of whack with respect to common practice.
   </p>
        <p>
      The moral of the story? Love your single string type. It’s a wonderful thing. But
      always remember: An allocation is an allocation; make sure you can afford it. Gen0
      collections aren’t free, and software written to assume they are is easily detectible.
      String.Split allocates an array and a substring for each element within; there’s almost
      always a better way.
   </p>
        <img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=ae7274ea-47f3-46e1-857d-f70361544ae4" />
      </body>
      <title>Beware the string</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,ae7274ea-47f3-46e1-857d-f70361544ae4.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2012/10/31/BewareTheString.aspx</link>
      <pubDate>Wed, 31 Oct 2012 02:43:31 GMT</pubDate>
      <description>&lt;p&gt;
   .NET holds an enormous advantage over C++.
&lt;/p&gt;
&lt;p&gt;
   Well, okay, there are a few, but I’m thinking about one in particular: A single string
   type.
&lt;/p&gt;
&lt;p&gt;
   What’s not to love about that? Anybody who has done more than an hour’s worth of Windows
   programming in C++ should appreciate this feature. No more zero-terminated char* vs.
   length-prefixed char* vs. BSTR vs. wchar_t vs. CStringA vs. CStringW vs. CComBSTR.
   Just System.String. Hurray!
&lt;/p&gt;
&lt;p&gt;
   There’s one very specific thing not to love, however: The ease with which you can
   allocate a new one.
&lt;/p&gt;
&lt;p&gt;
   I’ve been working in an environment where performance is critical, and everything
   is managed code, for several years now. That might sound like an oxymoron, but our
   system can in fact beat the pants off all the popular native programming environments.
   The key to success? Thought and discipline.
&lt;/p&gt;
&lt;p&gt;
   We, in fact, love our single string type. And yet our team has learned (the hard way)
   that string allocations, while seemingly innocuous and small, spell certain death.
&lt;/p&gt;
&lt;p&gt;
   It may seem strange to pick on string. There are dozens of other objects you might
   allocate, like custom data types, arrays, lists, and whatnot. But there tend to be
   many core infrastructural pieces that deal with string manipulation, and if you build
   atop the wrong abstractions then things are sure to go wrong.
&lt;/p&gt;
&lt;p&gt;
   Imagine a web stack. It’s all about string parsing and processing. And anything to
   do with distributed processing of data is most likely going to involve strings at
   some level. Etc.
&lt;/p&gt;
&lt;p&gt;
   There are landmine APIs lurking out there, like String.Split and String.Substring.
   Even if you’ve got an interned string in hand (often rare in a server environment
   where strings are built from dynamically produced data), using these APIs will allocate
   boatloads of tiny little strings. And boatloads of tiny little strings means collections.
&lt;/p&gt;
&lt;p&gt;
   For example, imagine I just want to perform some action for each substring in a comma-delimited
   string. I could of course write it as follows: 
&lt;p&gt;
&lt;pre&gt;&lt;code&gt;string str = ...; string[] substrs = str.Split(','); foreach (string subtr
in substrs) { Process(substr); }
&lt;/pre&gt;
&gt;&gt;
&lt;p&gt;
   Or I could write it as follows:
&lt;/p&gt;
&lt;p&gt;
&lt;pre&gt;&lt;code&gt;string str = ...; int lastIndex = 0; int commaIndex; while ((commaIndex
= str.IndexOf(',', commaIndex)) != -1) { Process(substr, lastIndex, commaIndex); lastIndex
= commaIndex + 1; }
&lt;/pre&gt;
&gt;&gt;
&lt;p&gt;
   The latter certainly requires a bit more thought. That’s primarily because .NET doesn’t
   have an efficient notion of substring – creating one requires an allocation. But the
   performance difference is night and day. The first one allocates an array and individual
   substrings, whereas the second performs no allocations. If this is, say, parsing HTTP
   headers on a heavily loaded server, you bet it’s going to make a noticeable difference.
&lt;/p&gt;
&lt;p&gt;
   Honestly, I’ve witnessed programs that should be I/O bound turn into programs that
   are compute-bound, simply due to use of inefficient string parsing routines across
   enormous amounts of data. (Okay, the developers also did other sloppy allocation-heavy
   things, but string certainly contributed.) Remember, many managed programs must compete
   with C++, where developers are accustomed to being more thoughtful about allocations
   in the context of parsing. Mainly because it’s such a pain in the ass to managed ad-hoc
   allocation lifetimes, versus in-place or stack-based parsing where it’s trivial.
&lt;/p&gt;
&lt;p&gt;
   "But gen0 collections are free," you might say. Sure, they are cheaper than gen1 and
   gen2 collections, but they are most certainly not free. Each collection is a linked
   list traversal that executes a nontrivial number of instructions and trashes your
   cache. It’s true that generational collectors minimize the pain, but they do not completely
   eliminate it. This, I think, is one of the biggest fallacies that plagues managed
   code to this day. Developers who treat the GC like their zero-cost scratch pad end
   up creating abstractions that poison the well for everybody.
&lt;/p&gt;
&lt;p&gt;
   Crank up .NET’s XmlReader and profile loading a modest XML document. You’ll be surprised
   to see that allocations during parsing add up to approximately 4X the document’s size.
   Many of these are strings. How did we end up in such a place? Presumably because whoever
   wrote these abstractions fell trap to the fallacy that "gen0 collections are free."
   But also because layers upon layers of such things lie beneath.
&lt;/p&gt;
&lt;p&gt;
   It doesn’t have to be this way. String does, after all, have an indexer. And it’s
   type-safe! So in-place parsing at least won’t lead to buffer overruns. Sadly, I have
   concluded that few people, at least in the context of .NET, will write efficient string
   parsing code. The whole platform is written to assume that strings are available,
   and does not have an efficient representation of a transient substring. And of course
   the APIs have been designed to coax you into making copy after copy, rather than doing
   efficient text manipulation in place. Hell, even the HTTP and ASP.NET web stacks are
   rife with such inefficiencies.
&lt;/p&gt;
&lt;p&gt;
   In certain arenas, doing all of this efficiently actually pays the bills. In others
   arenas, it doesn’t, and I suppose it’s possible to ignore all of this and let the
   GC chew up 30% or more of your program’s execution time without anybody noticing.
   I’m baffled that such software is written, but at the same time I realize that my
   expectations are out of whack with respect to common practice.
&lt;/p&gt;
&lt;p&gt;
   The moral of the story? Love your single string type. It’s a wonderful thing. But
   always remember: An allocation is an allocation; make sure you can afford it. Gen0
   collections aren’t free, and software written to assume they are is easily detectible.
   String.Split allocates an array and a substring for each element within; there’s almost
   always a better way.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=ae7274ea-47f3-46e1-857d-f70361544ae4" /&gt;</description>
      <category>Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=31bf3404-281e-42b5-bd2b-0954c2170c32</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,31bf3404-281e-42b5-bd2b-0954c2170c32.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
      A glimpse of some research we've done recently just appeared <a href="http://splashcon.org/2012/schedule/tuesday-oct-23/462">at
      OOPSLA last week</a>:
   </p>
        <blockquote>
          <p>
            <b>Uniqueness and Reference Immutability for Safe Parallelism</b>
          </p>
          <p>
            <i>A key challenge for concurrent programming is that side-effects (memory operations)
      in one thread can affect the behavior of another thread. In this paper, we present
      a type system to restrict the updates to memory to prevent these unintended side-effects.
      We provide a novel combination of immutable and unique (isolated) types that ensures
      safe parallelism (race freedom and deterministic execution). The type system includes
      support for polymorphism over type qualifiers, and can easily create cycles of immutable
      objects. Key to the system's flexibility is the ability to recover immutable or externally
      unique references after violating uniqueness without any explicit alias tracking.
      Our type system models a prototype extension to C# that is in active use by a Microsoft
      team. We describe their experiences building large systems with this extension. We
      prove the soundness of the type system by an embedding into a program logic.</i>
          </p>
        </blockquote>
        <p>
      The official ACM page is <a href="http://dl.acm.org/citation.cfm?id=2384619">here</a>,
      and a tech report version is available on <a href="http://research.microsoft.com/apps/pubs/default.aspx?id=170528">MSR's
      website</a>.
   </p>
        <p>
      As I said, this is just a glimpse. Its focus was mainly on the type soundness work
      we've done jointly with MSR, and less about the language, syntax, and uses. You'll
      have to use your imagination to fill in the rest ;-)
   </p>
        <img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=31bf3404-281e-42b5-bd2b-0954c2170c32" />
      </body>
      <title>Uniqueness and Reference Immutability for Safe Parallelism</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,31bf3404-281e-42b5-bd2b-0954c2170c32.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2012/10/28/UniquenessAndReferenceImmutabilityForSafeParallelism.aspx</link>
      <pubDate>Sun, 28 Oct 2012 22:58:30 GMT</pubDate>
      <description>&lt;p&gt;
   A glimpse of some research we've done recently just appeared &lt;a href="http://splashcon.org/2012/schedule/tuesday-oct-23/462"&gt;at
   OOPSLA last week&lt;/a&gt;:
&lt;/p&gt;
&lt;blockquote&gt; 
&lt;p&gt;
   &lt;b&gt;Uniqueness and Reference Immutability for Safe Parallelism&lt;/b&gt;
&lt;/p&gt;
&lt;p&gt;
   &lt;i&gt;A key challenge for concurrent programming is that side-effects (memory operations)
   in one thread can affect the behavior of another thread. In this paper, we present
   a type system to restrict the updates to memory to prevent these unintended side-effects.
   We provide a novel combination of immutable and unique (isolated) types that ensures
   safe parallelism (race freedom and deterministic execution). The type system includes
   support for polymorphism over type qualifiers, and can easily create cycles of immutable
   objects. Key to the system's flexibility is the ability to recover immutable or externally
   unique references after violating uniqueness without any explicit alias tracking.
   Our type system models a prototype extension to C# that is in active use by a Microsoft
   team. We describe their experiences building large systems with this extension. We
   prove the soundness of the type system by an embedding into a program logic.&lt;/i&gt;
&lt;/p&gt;
&lt;/blockquote&gt; 
&lt;p&gt;
   The official ACM page is &lt;a href="http://dl.acm.org/citation.cfm?id=2384619"&gt;here&lt;/a&gt;,
   and a tech report version is available on &lt;a href="http://research.microsoft.com/apps/pubs/default.aspx?id=170528"&gt;MSR's
   website&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;
   As I said, this is just a glimpse. Its focus was mainly on the type soundness work
   we've done jointly with MSR, and less about the language, syntax, and uses. You'll
   have to use your imagination to fill in the rest ;-)
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=31bf3404-281e-42b5-bd2b-0954c2170c32" /&gt;</description>
      <category>Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=9113a9aa-d98b-446c-b9ae-63d1926d2a27</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,9113a9aa-d98b-446c-b9ae-63d1926d2a27.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <title>Work fun</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,9113a9aa-d98b-446c-b9ae-63d1926d2a27.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2012/07/18/WorkFun.aspx</link>
      <pubDate>Wed, 18 Jul 2012 00:49:08 GMT</pubDate>
      <description>&lt;p&gt;
   It's been quite some time since I blogged.
&lt;/p&gt;
&lt;p&gt;
   The reason is simple, as always: I'm having the time of my life at work.
&lt;/p&gt;
&lt;p&gt;
   I will try to do better blogwise in the coming months. But in the meantime ...
&lt;/p&gt;
&lt;p&gt;
   If you'd like to join me, &lt;a href="http://blog.adamnathan.net/"&gt;Adam&lt;/a&gt;, and &lt;a href="http://blogs.msdn.com/b/kcwalina/"&gt;Krzysztof&lt;/a&gt; in
   our quest to build the best APIs and developer platform known to mankind, please &lt;a href="mailto:joedu@microsoft.com"&gt;shoot
   me an email with you resume&lt;/a&gt;. Or just &lt;a href="https://careers.microsoft.com/jobdetails.aspx?ss=&amp;pg=0&amp;so=&amp;rw=1&amp;jid=74196&amp;jlang=EN"&gt;apply&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;
   In short, you could be having the time of your life too. Why wait?
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=9113a9aa-d98b-446c-b9ae-63d1926d2a27" /&gt;</description>
      <category>Miscellaneous;Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=e1e621c4-c9d2-47ed-9144-f881a3c00588</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,e1e621c4-c9d2-47ed-9144-f881a3c00588.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
      I often wish that .NET had erred on the side of offering postmortem instead of premortem
      finalization.
   </p>
        <p>
      The distinction here is when exactly the finalizer runs, i.e. <em>after</em> or <em>before</em> the
      GC has actually reclaimed an object. This governs whether a dying object is (a) accessible
      from within its own finalizer, and therefore (b) eligible to become resurrected. Postmortem
      finalization occurs after the object is long gone, and hence says “no” to both of
      these questions; premortem finalization happens beforehand and hence says “yes.”
   </p>
        <p>
      .NET chose the latter.
   </p>
        <p>
      The primary downside of premortem finalization, setting aside the confusing nature
      of resurrection, is that the object in question cannot be collected until <em>after</em> its
      finalizer has run. This should be fairly obvious: it is only that second time the
      object is found to be dead “again” that we know the finalizer has or has not resurrected
      it.
   </p>
        <p>
      This may seem like a small matter. But it matters quite a lot when building high performance
      software. In a garbage collected system, relying on high rates of finalization to
      keep up with demanding workloads almost never works. But in a premortem finalization
      system, even moderate demands become cause for concern.
   </p>
        <p>
      Premortem finalization leads to finalized objects getting promoted to the elder generations
      before actually dying. If you check the value of GC.GetGeneration(this) within an
      object’s finalizer, for example, you will notice it is one greater than the generation
      in which the object was found to be dead the first time. Say it was found dead in
      Gen1; then GC.GetGeneration(this) will return ‘2’. Yet another collection must happen,
      in Gen2 to boot, in order to actually reclaim this object. And, of course, it’s not
      just this object, but also the transitive closure of objects to which it refers.
   </p>
        <p>
      This approach penalizes the majority use case of finalizable objects. At least on
      .NET, most objects merely invoke CloseHandle on an IntPtr in the finalizer. This clearly
      needn’t hold up freeing the managed state. And resurrection is a dubious scenario
      anyway: such objects quickly end up in Gen2 where collections are expensive and infrequent.
      If you’re pooling via resurrection because you create expensive objects at a high
      rate of birth and death, manual memory management (or a different design altogether)
      is likely your only savior.
   </p>
        <p>
      Although Java’s finalizers are also premortem, the JDK offers the facilities necessary
      to implement postmortem finalization on your own. It entails using WeakReference<T>
         and ReferenceQueue<T>
            . See <a href="http://java.sun.com/developer/technicalArticles/javase/finalization/">this
            article</a> if you are curious.
         </T></T></p>
        <p>
      .NET doesn’t offer the notifications required to do the same. You can, however, learn
      from postmortem finalization to write better premotem finalizers: prefer simple finalizable
      objects that refer to only the state necessary to implement finalization – which ordinarily
      means no other managed objects. The SafeHandle abstraction is a good example of this.
      Most implementations are comprised of a simple IntPtr. This pattern will ensure that
      collateral promotion due to finalization is more contained.
   </p>
        <p>
      After saying all of this, I hope it is just amusing trivia. I'm sure nobody is writing
      finalizers these days anyway.
   </p>
        <img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=e1e621c4-c9d2-47ed-9144-f881a3c00588" />
      </body>
      <title>A brief note on object mortality</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,e1e621c4-c9d2-47ed-9144-f881a3c00588.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2011/11/12/ABriefNoteOnObjectMortality.aspx</link>
      <pubDate>Sat, 12 Nov 2011 22:03:13 GMT</pubDate>
      <description>&lt;p&gt;
   I often wish that .NET had erred on the side of offering postmortem instead of premortem
   finalization.
&lt;/p&gt;
&lt;p&gt;
   The distinction here is when exactly the finalizer runs, i.e. &lt;em&gt;after&lt;/em&gt; or &lt;em&gt;before&lt;/em&gt; the
   GC has actually reclaimed an object. This governs whether a dying object is (a) accessible
   from within its own finalizer, and therefore (b) eligible to become resurrected. Postmortem
   finalization occurs after the object is long gone, and hence says “no” to both of
   these questions; premortem finalization happens beforehand and hence says “yes.”
&lt;/p&gt;
&lt;p&gt;
   .NET chose the latter.
&lt;/p&gt;
&lt;p&gt;
   The primary downside of premortem finalization, setting aside the confusing nature
   of resurrection, is that the object in question cannot be collected until &lt;em&gt;after&lt;/em&gt; its
   finalizer has run. This should be fairly obvious: it is only that second time the
   object is found to be dead “again” that we know the finalizer has or has not resurrected
   it.
&lt;/p&gt;
&lt;p&gt;
   This may seem like a small matter. But it matters quite a lot when building high performance
   software. In a garbage collected system, relying on high rates of finalization to
   keep up with demanding workloads almost never works. But in a premortem finalization
   system, even moderate demands become cause for concern.
&lt;/p&gt;
&lt;p&gt;
   Premortem finalization leads to finalized objects getting promoted to the elder generations
   before actually dying. If you check the value of GC.GetGeneration(this) within an
   object’s finalizer, for example, you will notice it is one greater than the generation
   in which the object was found to be dead the first time. Say it was found dead in
   Gen1; then GC.GetGeneration(this) will return ‘2’. Yet another collection must happen,
   in Gen2 to boot, in order to actually reclaim this object. And, of course, it’s not
   just this object, but also the transitive closure of objects to which it refers.
&lt;/p&gt;
&lt;p&gt;
   This approach penalizes the majority use case of finalizable objects. At least on
   .NET, most objects merely invoke CloseHandle on an IntPtr in the finalizer. This clearly
   needn’t hold up freeing the managed state. And resurrection is a dubious scenario
   anyway: such objects quickly end up in Gen2 where collections are expensive and infrequent.
   If you’re pooling via resurrection because you create expensive objects at a high
   rate of birth and death, manual memory management (or a different design altogether)
   is likely your only savior.
&lt;/p&gt;
&lt;p&gt;
   Although Java’s finalizers are also premortem, the JDK offers the facilities necessary
   to implement postmortem finalization on your own. It entails using WeakReference&lt;T&gt;
      and ReferenceQueue&lt;T&gt;
         . See &lt;a href="http://java.sun.com/developer/technicalArticles/javase/finalization/"&gt;this
         article&lt;/a&gt; if you are curious.
&lt;/p&gt;
&lt;p&gt;
   .NET doesn’t offer the notifications required to do the same. You can, however, learn
   from postmortem finalization to write better premotem finalizers: prefer simple finalizable
   objects that refer to only the state necessary to implement finalization – which ordinarily
   means no other managed objects. The SafeHandle abstraction is a good example of this.
   Most implementations are comprised of a simple IntPtr. This pattern will ensure that
   collateral promotion due to finalization is more contained.
&lt;/p&gt;
&lt;p&gt;
   After saying all of this, I hope it is just amusing trivia. I'm sure nobody is writing
   finalizers these days anyway.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=e1e621c4-c9d2-47ed-9144-f881a3c00588" /&gt;</description>
      <category>Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=2d01b5eb-7f19-4a4c-a3b3-1cecb1e3dc75</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,2d01b5eb-7f19-4a4c-a3b3-1cecb1e3dc75.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
      It's been unbelievably long since I last blogged.
   </p>
        <p>
      The reason is simple. I've been ecstatic in my job and, every time I think to write
      something, I quickly end up turning to work and soon find that hours (days? months?)
      have passed. This is a wonderful problem to have, but not so good for keeping the
      blog looking fresh and new. (I've also been writing <a href="http://soundcloud.com/45nm_joeduffy/sets/current/">a
      fair bit of music</a> lately.) Well, this weekend I managed to lock myself out of
      my VPN access, and decided that this was a sign that I ought to dust off the cobwebs
      on a blog entry or two that I've had in the works for quite some time.
   </p>
        <p>
      The topic for today is generics, a feature many of us know and love. Specifically,
      their impact on software performance, something I frequently see developers struggling
      to understand and tame in the wild. 
   </p>
        <h1>The blessing; and, the curse
   </h1>
        <p>
      I absolutely love generics. I can hardly imagine writing code without them these days.
      The code reuse, higher-order expressiveness, beautiful abstractions, and static type-safety
      enabled by first class parametric polymorphism are all game-changing. And being a
      language history wonk, I'm delighted to see many mainstream programming languages
      stealing a page from ML and theoretical CS generally.
   </p>
        <p>
      Generics, however, are not free. And in some circumstances, they are, dare I say,
      rather expensive. Few language features surpass generics in the ability to write a
      concise and elegant bit of code, which then translates into reams of ugly assembly
      code out the rear end of the compiler. I am of course speaking mainly to models in
      which compilation leads to code specialization (like .NET's), versus erasure (like
      Java's).
   </p>
        <p>
      Most developers coming from a C++ background understand code expansion deeply, because
      they program with templates. Unlike templates, however, there is ample runtime type
      information (RTTI) associated with generic instantiations… such that the costs associated
      with generics frequently – and perhaps surprisingly – are a superset of those costs
      normally associated with C++ templates. At the same time, because the compiler understands
      parametric polymorphism, it can sometimes do a better job optimizing, e.g. with techniques
      like code sharing.
   </p>
        <p>
      Basically, with templates and erasure, the equation for predicting code expansion
      is super simple. You get it all (in the former) or you get none of it (in the latter),
      but with specialized generics this equation is quite complex.
   </p>
        <p>
      Paradoxically, these same costs are the main value that generics bring to the table!
      Write a little type-agnostic code and then "instantiate" that same code over multiple
      types without repeating yourself. But, generics are not magic; did you ever stop to
      wonder things like: What machine code is generated for these types? Does the compiler
      need to specialize the actual code that runs on the processor for unique instantiations,
      or is it all the same? And if it does need to specialize, where, how, and why? And
      perhaps most importantly, what hidden costs are there, and how should I think about
      them while writing code?
   </p>
        <p>
      Before reading further, paranoia need not ensue. The point of this article is merely
      to raise your awareness. All programmers should know what the abstractions they use
      cost, and make conscious tradeoffs when writing code with them. The aforementioned
      benefits of generics really are often "worth it," both in the elegance and reusability
      of abstractions, and in developer productivity. In my experience, however, the associated
      costs are so subtle and ill-documented that even people who write highly generic code
      typically remain unaware of them. Even more subtly, these costs are somewhat different
      in nature when pre-compiling your code, such as with .NET's NGen technology.
   </p>
        <p>
      This brief essay will walk through a few such costs in the context of the .NET Framework
      and CLR's implementation of generics. This is in no way an exhaustive study of generic
      compilation, and your mileage will vary from one platform to the next. Although the
      studies presented would apply to other implementations of generics, the reality is
      that if you're writing code in, say, Java – where type erasure is employed rather
      than code specialization – then all of this is going to be less relevant to you.
   </p>
        <p>
      With no further delay, let's get started.
   </p>
        <h1>Code, RTTI, oh my
   </h1>
        <p>
      When considering costs, we must always think about both size and speed.
   </p>
        <p>
      There is at least as much assembly code created for an instantiation as the code you've
      written for the generic abstraction in C# or MSIL. A simple mental model – that thankfully
      turns out isn't entirely accurate, thanks to some sharing optimizations described
      below – is that for each instantiation of a generic type or method you get a new copy
      of that code specialized to the type in question. Obviously, this increases code size.
      And just as obviously, it will add some runtime cost to JIT compile the code (if you
      aren't using ahead of time compilation), as well as putting more pressure on I-cache
      and TLB.
   </p>
        <p>
      Another source of significant cost is the runtime data structures needed for RTTI
      and Reflection, like vtables and other metadata. Quite simply, the runtime needs to
      know the identity of each generic instantiation, to prevent things like casting a
      List&lt;Int16&gt; to a List&lt;String&gt;, and even List&lt;Object&gt; to List&lt;String&gt;;
      and given that there is often distinct code generated for unique instantiations, the
      vtable contents for those different List&lt;T&gt; instantiations are going to look
      quite different.
   </p>
        <p>
      And of course, there are statics. Each generic instantiation gets its own set, requiring
      extra storage and another level of indirection when fetching them. Unique statics
      means D-cache and TLB pressure. It turns out that code shared across AppDomains, like
      mscorlib.dll, already need such things. But I have found that it's surprisingly common
      for a developer to throw a static field (or nested class!) onto an outer generic type,
      without actually needing it to be replicated for each unique instantiation.
   </p>
        <p>
      In addition to the immediate effects, generic types often refer to other generic types
      which refer to other generic types … and so on. Instantiating a root type is akin
      to instantiating the full transitive closure.
   </p>
        <p>
      To make our discussion friendly and familiar, we shall use the .NET Framework's List&lt;T&gt;
      type – presumably one of the most commonly used generic types on the planet – to illustrate
      many of these costs. And unfortunately, you'll also see that many of the common performance
      pitfalls plague this type too. (So, really, you need not feel bad if your own code
      is guilty of them too.)
   </p>
        <h1>Why the distinct code, anyway?
   </h1>
        <p>
      There is only one copy of List&lt;T&gt;'s code in mscorlib's MSIL. It is essentially
      just a blueprint for the list class.
   </p>
        <p>
      When I create a List&lt;Int16&gt; in my program and use it, however, there clearly
      needs to be some assembly code created in order to execute List&lt;T&gt;'s associated
      functionality, just with any T's used by List&lt;T&gt;'s code replaced by actual 2-byte
      short integers. And similarly, if I were to instantiate a List&lt;String&gt;, all
      those T's need to be replaced by pointer-sized object references, either 4- or 8-bytes
      depending on machine architecture, that are reported live to the garbage collector.
   </p>
        <p>
      This is what leads to our simple mental model above, in which each instantiation gets
      its own copy of the code. In this case, both List&lt;Int16&gt; and List&lt;String&gt;
      would be entirely independent types at runtime, with wholly separate copies of the
      machine code.
   </p>
        <p>
      Certainly if I manually went about creating my own Int16List and StringList types,
      they would be distinct types with distinct machine code generated. Being a prudent
      developer, however, I'd probably try to arrange to share as much of the implementation
      as possible between the two types, perhaps using implementation inheritance. But alas,
      there's no way I could share it all: any code specific to Int16 or String, for example,
      would surely differ, both in MSIL and in the native code.
   </p>
        <p>
      Generics basically give you the ability to do this same thing, without you needing
      to do the factoring of type-independent and type-specific code yourself. The compiler
      does that for you.
   </p>
        <p>
      Why might the code be different? As stated above, Int16 values are 2 bytes and String
      pointers are native word sized (4 bytes on 32-bit, 8 bytes on 64-bit). All the code
      that passes values of type T on the stack, either as arguments or return values, moves
      instances into and out of memory locations (like the T[] backing array), and so on,
      needs to be specialized based on the size of T. This wouldn't be true of a generics
      implementation that used type erasure, like Java's, but then you'd need to box the
      value types on the heap so that everything is a pointer. If T is a Float, we will
      likely emit code that uses floating point math instead of general purpose registers.
      Any tables that report GC roots are likely to be different, since object references
      can be embedded inside struct values that get laid out on the stack. And so on. Some
      day you might want to compare the machine code for a simple generic Echo&lt;T&gt;
      method for different kinds of T's; it is really easy to do, and is quite illustrative.
   </p>
        <p>
      A naïve wish might go as follows. Imagine that I had written my own dedicated Int16List
      and StringList types, and that we diffed the resulting machine code between the distinct
      list types; we'd presumably find a fair bit of duplication for all the reasons stated
      above. It would be a nice property if, when we used the generic List&lt;Int16&gt;
      and List&lt;String&gt; types, and similarly diffed the resulting assembly, the amount
      of specialized code would be no greater than the amount of specialized code between
      our best hand-written Int16List and StringList types. I.e., only parts that need to
      be different are different.
   </p>
        <p>
      We could go even further with our wish. Imagine I had a List&lt;DateTime&gt; and List&lt;Int64&gt;.
      Both are 8-byte values, and do not contain any GC references. If I were writing a
      specialized 8ByteValueList in C++ and had immense performance constraints, I would,
      again being a prudent developer, probably use some type unsafe code, with nasty reinterpret_casts,
      so that I could use the same list type to store any kind of 8-byte value. (Except
      in C++ I could even store pointers!) It would also be a nice property if generics
      did some of this for us, while still retaining the type safety we love about generics.
   </p>
        <p>
      It turns out we will get neither of our wishes exactly, although we will get something
      close to the spirit of our wishes.
   </p>
        <h1>Code sharing
   </h1>
        <p>
      Indeed, the CLR does arrange to share many generic instantiations. The rule is simple,
      although it is subject to change in the future (being an optimization and all): instantiations
      over reference types are shared among all reference type instantiations for that generic
      type/method, whereas instantiations over value types get their own full copy of the
      code. In other words, List&lt;String&gt; and List&lt;Object&gt; are backed by the
      same code, but List&lt;DateTime&gt; and List&lt;Int64&gt; get their own.
   </p>
        <p>
      It is true that, in theory, List&lt;DateTime&gt; and List&lt;Int64&gt; could use the
      same shared code, because they are of identical size and have GC roots in the same
      locations (trivially, because neither has one). But there are additional restrictions
      on generated code that makes this problematic, for example if we were talking about
      Double and Int64. In short, the CLR doesn't actually share value type instantiations
      as of the 4.0 runtime, although clearly it could in certain situations (value types
      of the same size with GC roots in the same locations).
   </p>
        <p>
      As you might guess, this extends to multi-parameter generics in obvious ways. A Dictionary&lt;Object,
      Object&gt; is shared with a Dictionary&lt;String, String&gt;, etc., and a Dictionary&lt;Int64,
      Object&gt; is shared with a Dictionary&lt;Int64, String&gt;. A Dictionary&lt;DateTime,
      DateTime&gt; is not, however, shared with a Dictionary&lt;Int64, Int64&gt; instantiation,
      as per the above.
   </p>
        <p>
      My pal <a href="http://callvirt.net/blog/">Joel Pobar</a> wrote <a href="http://blogs.msdn.com/b/joelpob/archive/2004/11/17/259224.aspx">a
      post eons ago describing how code sharing works</a> in great detail, which I do not
      intend to rehash. Please refer to his post for an excellent overview of how code sharing
      works.
   </p>
        <p>
      An important thing to remember, however, is that no matter how much code sharing happens,
      you still need distinct RTTI data structures. So although List&lt;Object&gt; and List&lt;String&gt;
      share the same machine code, they have distinct vtables; sure, each table is full
      of pointers to the same code functions, but you are still paying for the runtime data
      structures. A distinct instantiation, therefore, is never actually free!
   </p>
        <h1>Transitive closures
   </h1>
        <p>
      Why am I making such a big deal about code sharing, anyway?
   </p>
        <p>
      Another surprising aspect of generics is the transitive closure problem. Particularly
      when doing pre-compilation of generics, each unique instantiation doesn't simply lead
      to a specialized version of the code associated with the type being directly instantiated.
      The whole transitive closure of types, starting with that root type, will also be
      compiled. This can be a surprisingly huge number of types! JIT is much more pay-for-play,
      such that you get one level of explosion at a time, but once there is code that calls
      a particular type's method, even if that code is lazily compiled, creation of the
      type is forced.
   </p>
        <p>
      To illustrate this, let's take our friend List&lt;T&gt;. Before examining the list,
      how many generic types would you expect that a single new List&lt;T&gt; instantiation
      instantiates?
   </p>
        <p>
      What if I told you that a single List&lt;int&gt; instantiation creates (at least)
      28 types? And that, say, five unique instantiations of List&lt;T&gt; might cost you
      300K of disk space and 70K of working set? Well, of course, if you are writing a script,
      or something with fairly loose performance requirements, this might not matter much.
      But if topics like download time, mobile footprint, and cache performance are important
      to you, then you probably want to pay attention to this. To a first approximation,
      size <u><i>is</i></u> speed.
   </p>
        <p>
      Yes, you heard me right: 28 types. Holy smokes... How can this be?!
   </p>
        <p>
      Nested types are one obvious answer, and indeed List&lt;T&gt; has two: an Enumerator
      class (which is reasonable), and one to support the legacy synchronized collections
      pattern (which we presumably wish we didn't have to pay for). The larger answer here,
      however, is functionality. Yes, functionality! This is a great example where the cost
      of generics explodes as you add more features. Start simple, keep adding stuff, as
      has happened to List&lt;T&gt; over the years, and you will soon find that a series
      of elegant abstractions adds up to a gut-wrenching bucket of bytes.
   </p>
        <p>
      Here's a quick sketch of the transitive closure of generic types used by List&lt;T&gt;:
   </p>
        <p>
        </p>
        <pre>
          <code>List&lt;T&gt; T[] type IList&lt;T&gt; type ICollection&lt;T&gt; type
   IEnumerable&lt;T&gt; type IEnumerator&lt;T&gt; type ReadOnlyCollection&lt;T&gt; type
   (AsReadOnly) (Nothing more than List&lt;T&gt;) IComparer&lt;T&gt; type (BinarySearch,
   Sort) {Array.BinarySearch&lt;T&gt; method (BinarySearch)} ArraySortHelper&lt;T&gt;
   type IArraySortHelper&lt;T&gt; type GenericArraySortHelper&lt;T&gt; type EqualityComparer&lt;T&gt;
   type (Contains) IEqualityComparer&lt;T&gt; type IEquatable&lt;T&gt; type NullableEqualityComparer&lt;T&gt;
   type Nullable&lt;T&gt; type EnumEqualityComparer&lt;T&gt; type {JitHelpers.UnsafeEnumCast&lt;T&gt;
   method} ObjectEqualityComparer&lt;T&gt; type Predicate&lt;T&gt; delegate type (Find*)
   Action&lt;T&gt; delegate type (ForEach) {Array.LastIndexOf&lt;T&gt; method (LastIndexOf)}
   Comparison&lt;T&gt; delegate type (Sort) Array.FunctorComparer&lt;T&gt; type (Sort)
   Comparer&lt;T&gt; type GenericComparer&lt;T&gt; type NullableComparer&lt;T&gt; type
   ObjectComparer&lt;T&gt; type {Array.Sort&lt;T&gt; method (Sort)} ArraySortHelper&lt;T&gt;
   type (see earlier) Enumerator inner type SynchronizedList&lt;T&gt; inner type IList&lt;T&gt;
   interface (see earlier) ICollection&lt;T&gt; interface (see earlier) IEnumerable&lt;T&gt;
   interface (see earlier) {Interlocked.CompareExchange&lt;Object&gt; method (SyncRoot)}
   {_emptyArray T[] static field} </code>
        </pre>
        <p>
      I'm not trying to pick on List&lt;T&gt;. This class is only unique in this regard
      in that it offers a large transitive closure of (mostly useful!) functionality. And
      it's not the only guilty party. We recently shaved off 100K's of code size on my team,
      for example, that were being lost simply because all the LINQ methods were declared
      as instance methods on the base collection class, rather than being extension methods
      as in .NET. We found nested enumerator and iterator types, cached static lambdas as
      static fields, and huge transitive closures of other generic types, all allocated
      when you just touched any collection type. Any collections library is apt to be full
      of this stuff, since they are highly generic. But collection libraries are certainly
      not the only places to go sniffing for such problems.
   </p>
        <p>
      As an aside, it turns out that extension methods are a great way to make generic abstractions
      more pay-for-play.
   </p>
        <h1>Adding it up
   </h1>
        <p>
      Let's see what the above adds up to. I ran some programs through NGen as a quick and
      dirty experiment, and inspected the on-disk sizes and also the runtime working set
      sizes. I ensured clrjit.dll was not loaded into the process. Here's what I found.
      Take these numbers with a grain of salt, as they will change from release to release;
      they are simply rules of thumb. When in doubt, crank up NGen, DumpBin, and/or start
      trawling the heap with VADump yourself!
   </p>
        <p>
      One empty type with no methods in CLR 4.0 seems to cost roughly 0.2K bytes of on-disk
      metadata, and about 0.7K in x64 working set. (This is a good rule of thumb irrespective
      of generics… in terms of order of magnitude, you can think "one empty type means 1K
      of memory.") A single List&lt;S&gt; instantiation, where S is an empty struct, is
      in the neighborhood of 60K on-disk metadata, and 14K of x64 working set. A single
      List&lt;C&gt; instantiation, however, where C is an empty class, is only – surprise
      – about 7K on-disk and 4K in-memory. Why the large discrepancy? Well, it just so happens
      that mscorlib.dll already includes an instantiation or two of List&lt;T&gt; over reference
      types, so this 4K is the incremental cost on top of reusing what is there; remember,
      there are still unique vtables and data structures still required for RTTI.
   </p>
        <p>
      Rico did <a href="http://blogs.msdn.com/b/ricom/archive/2005/08/26/performance-quiz-7-generics-improvements-and-costs-solution.aspx">a
      similar analysis a few years back</a>, and concluded that each unique List<E>
         , where E was an enum type, cost 8K. Why the increase to 14K over the years? x64 and
         ever-increasing functionality on the basic collections classes, presumably. Remember,
         it's not just List&lt;T&gt; that has grown, it's also everything that List&lt;T&gt;
         uses internally as an implementation detail.
      </E></p>
        <h1>Dynamic specialization with dictionaries
   </h1>
        <p>
      Some specialization in behavior can be accomplished with dynamic runtime behavior,
      rather than static code specialization. A prime example is the following:
   </p>
        <p>
        </p>
        <pre>
          <code>class C { public static void M&lt;T&gt;() { System.Console.WriteLine(typeof(T).Name);
   } } </code>
        </pre>
        <p>
      Where does the program get the value of typeof(T) from? If you look at the MSIL, you
      will see that C# has emitted a ldtoken MSIL instruction. For some struct type, we
      can compile that as a constant in the code, because it is getting its own copy of
      the code. What occurs when two instantiations share code, like M&lt;String&gt; and
      M&lt;Object&gt;, however? As you might guess, there is an indirection.
   </p>
        <p>
      The thing we usually use for such indirections – the vtable – is nowhere to be found
      in this particular example, because M is a static method. To deal with this, the compiler
      inserts an extra "hidden" argument, frequently called a generic dictionary, from which
      the emitted assembly code can fetch the type token. The cost here typically isn't
      bad, because many of the operations that pull in the dictionary are already RTTI or
      Reflection-based, and would require an indirection already (e.g., through a vtable).
   </p>
        <p>
      The operations which require a dictionary of some kind include anything that has to
      do with RTTI and yet no vtable is readily accessible: typeof, casts, is and as operators,
      etc. And as you might guess, if instantiations aren't shared (such as with value types
      on the CLR), no dictionary is needed, because the code is fully specialized. There
      are also multiple kinds of dictionaries used by the runtime, depending on whether
      you are using a generic type, method, or some combination of both.
   </p>
        <h1>JITting when you didn't mean to
   </h1>
        <p>
      There are two primary ways in which you will JIT compile when using generics, even
      if you were good doobie and used NGen to reduce startup time.
   </p>
        <p>
      One way is if you instantiate a new generic type exported from mscorlib.dll with a
      type argument also defined in mscorlib.dll, that wasn't already instantiated inside
      mscorlib.dll. (See my old <a href="http://www.bluebytesoftware.com/blog/2005/03/23/DGUpdateGenericsAndPerformance.aspx">Generics
      and Performance</a> blog entry for more details.) You can very easily see this happening
      by using an instantiation like Dictionary&lt;DateTime, DateTime&gt;, and watching
      the clrjit.dll module getting loaded.
   </p>
        <p>
      The other way is generic virtual methods (GVMs). It turns out that GVMs pose incredible
      difficulty for ahead of time separate compilation, because the compiler cannot know
      statically which slot in the vtable points at the particular implementation you are
      about to call. (Unless you use whole program compilation, something not offered by
      .NET at present time.) For each such method, there's an unbounded set of possible
      specialized instantiations a slot might point to, and so the vtable cannot be laid
      out in a traditional manner. C++ doesn't allow templated virtual methods for this
      very reason.
   </p>
        <p>
      Thankfully, GVMs are somewhat rare. However, it only took 5 minutes of poking around
      to find one that is quite front-and-center in .NET: in the implementation of LINQ,
      there is an Iterator&lt;T&gt; type that has a method declared as follows:
   </p>
        <p>
        </p>
        <pre>
          <code>public abstract IEnumerable&lt;TResult&gt; Select&lt;TResult&gt;(Func&lt;TSource,
   TResult&gt; selector); </code>
        </pre>
        <p>
      All we need to do is figure out how to tickle that method, and we're guaranteed to
      JIT. As it turns out, sure enough, the following code does the trick and forces clrjit.dll
      to get loaded in .NET 4.0:
   </p>
        <p>
        </p>
        <pre>
          <code>int[] xs = …; int[] ys = xs.Where(x =&gt; true).Select(x =&gt; x).ToArray(); </code>
        </pre>
        <p>
      The Iterator&lt;T&gt; type is used for back-to-back Where and Select operators, as
      a performance optimization that avoids excess allocations and interface dispatch.
      But because it depends on a GVM, it does incur an initial penalty for using it, even
      if you have used NGen to avoid runtime code generation.
   </p>
        <h1>In conclusion
   </h1>
        <p>
      The moral of the story here is <i>not</i> that you should fear generics. Beautiful
      things can be built with them.
   </p>
        <p>
      Instead, it's to use generics thoughtfully. Nothing in life is free, and generics
      are no exception to this rule. If code size is important to you, then you will want
      to have performance gates measuring your numbers against your goals; if you are working
      in a codebase that uses generics heavily, and you end up spending any significant
      time on code size optimizations, you will want to try to track down large transitive
      closures. As I stated above, you could really be throwing away 100K's of code here.
   </p>
        <p>
      And as to the surprise JITting, I've seen teams compiling with NGen and having a functional
      gate that fails any new code that causes clrjit.dll to get loaded at runtime. Although
      tracking down the root cause might be tricky when that gate fails, at least you won't
      let the camel's nose under the tent.
   </p>
        <p>
      Investing in tools here is a very good idea.
   </p>
        <p>
      When it comes down to it, really thinking about what code must be executed by the
      process is helpful. Step back and imagine you were writing this all in C++, with the
      associated performance concerns front-and-center: consider how you'd arrange to reuse
      as much implementation as you can, manage memory efficiently, perhaps employ unsafe
      tricks that would have violated type safety and so are offlimits in .NET, and all
      that jazz. Then step back and be grateful that you have a type- and memory-safe environment
      to help you write more robust code, but also be realistic about what you are paying
      in exchange.
   </p>
        <p>
      I hope you've learned a useful thing or two in this article. If you'd like to learn
      more, here are a few other good resources:
   </p>
        <ol>
          <li>
         An MSR paper on the original implementation of .NET generics: <a href="http://research.microsoft.com/pubs/64031/designandimplementationofgenerics.pdf">http://research.microsoft.com/pubs/64031/designandimplementationofgenerics.pdf</a></li>
          <li>
         Rico's "Six Questions About Generics and Performance" blog entry: <a href="http://blogs.msdn.com/b/ricom/archive/2004/09/13/229025.aspx">http://blogs.msdn.com/b/ricom/archive/2004/09/13/229025.aspx</a></li>
          <li>
         Joel Pobar's "Generics and Code Sharing" blog post: <a href="http://blogs.msdn.com/b/joelpob/archive/2004/11/17/259224.aspx">http://blogs.msdn.com/b/joelpob/archive/2004/11/17/259224.aspx</a></li>
          <li>
         My "Generics and Performance" blog entry: <a href="http://www.bluebytesoftware.com/blog/2005/03/23/DGUpdateGenericsAndPerformance.aspx">http://www.bluebytesoftware.com/blog/2005/03/23/DGUpdateGenericsAndPerformance.aspx</a></li>
        </ol>
        <p>
      Cheers.
   </p>
        <img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=2d01b5eb-7f19-4a4c-a3b3-1cecb1e3dc75" />
      </body>
      <title>On generics and (some of) the associated overheads</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,2d01b5eb-7f19-4a4c-a3b3-1cecb1e3dc75.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2011/10/23/OnGenericsAndSomeOfTheAssociatedOverheads.aspx</link>
      <pubDate>Sun, 23 Oct 2011 22:26:02 GMT</pubDate>
      <description>&lt;p&gt;
   It's been unbelievably long since I last blogged.
&lt;/p&gt;
&lt;p&gt;
   The reason is simple. I've been ecstatic in my job and, every time I think to write
   something, I quickly end up turning to work and soon find that hours (days? months?)
   have passed. This is a wonderful problem to have, but not so good for keeping the
   blog looking fresh and new. (I've also been writing &lt;a href="http://soundcloud.com/45nm_joeduffy/sets/current/"&gt;a
   fair bit of music&lt;/a&gt; lately.) Well, this weekend I managed to lock myself out of
   my VPN access, and decided that this was a sign that I ought to dust off the cobwebs
   on a blog entry or two that I've had in the works for quite some time.
&lt;/p&gt;
&lt;p&gt;
   The topic for today is generics, a feature many of us know and love. Specifically,
   their impact on software performance, something I frequently see developers struggling
   to understand and tame in the wild. 
&lt;h1&gt;The blessing; and, the curse
&lt;/h1&gt;
&lt;p&gt;
   I absolutely love generics. I can hardly imagine writing code without them these days.
   The code reuse, higher-order expressiveness, beautiful abstractions, and static type-safety
   enabled by first class parametric polymorphism are all game-changing. And being a
   language history wonk, I'm delighted to see many mainstream programming languages
   stealing a page from ML and theoretical CS generally.
&lt;/p&gt;
&lt;p&gt;
   Generics, however, are not free. And in some circumstances, they are, dare I say,
   rather expensive. Few language features surpass generics in the ability to write a
   concise and elegant bit of code, which then translates into reams of ugly assembly
   code out the rear end of the compiler. I am of course speaking mainly to models in
   which compilation leads to code specialization (like .NET's), versus erasure (like
   Java's).
&lt;/p&gt;
&lt;p&gt;
   Most developers coming from a C++ background understand code expansion deeply, because
   they program with templates. Unlike templates, however, there is ample runtime type
   information (RTTI) associated with generic instantiations… such that the costs associated
   with generics frequently – and perhaps surprisingly – are a superset of those costs
   normally associated with C++ templates. At the same time, because the compiler understands
   parametric polymorphism, it can sometimes do a better job optimizing, e.g. with techniques
   like code sharing.
&lt;/p&gt;
&lt;p&gt;
   Basically, with templates and erasure, the equation for predicting code expansion
   is super simple. You get it all (in the former) or you get none of it (in the latter),
   but with specialized generics this equation is quite complex.
&lt;/p&gt;
&lt;p&gt;
   Paradoxically, these same costs are the main value that generics bring to the table!
   Write a little type-agnostic code and then "instantiate" that same code over multiple
   types without repeating yourself. But, generics are not magic; did you ever stop to
   wonder things like: What machine code is generated for these types? Does the compiler
   need to specialize the actual code that runs on the processor for unique instantiations,
   or is it all the same? And if it does need to specialize, where, how, and why? And
   perhaps most importantly, what hidden costs are there, and how should I think about
   them while writing code?
&lt;/p&gt;
&lt;p&gt;
   Before reading further, paranoia need not ensue. The point of this article is merely
   to raise your awareness. All programmers should know what the abstractions they use
   cost, and make conscious tradeoffs when writing code with them. The aforementioned
   benefits of generics really are often "worth it," both in the elegance and reusability
   of abstractions, and in developer productivity. In my experience, however, the associated
   costs are so subtle and ill-documented that even people who write highly generic code
   typically remain unaware of them. Even more subtly, these costs are somewhat different
   in nature when pre-compiling your code, such as with .NET's NGen technology.
&lt;/p&gt;
&lt;p&gt;
   This brief essay will walk through a few such costs in the context of the .NET Framework
   and CLR's implementation of generics. This is in no way an exhaustive study of generic
   compilation, and your mileage will vary from one platform to the next. Although the
   studies presented would apply to other implementations of generics, the reality is
   that if you're writing code in, say, Java – where type erasure is employed rather
   than code specialization – then all of this is going to be less relevant to you.
&lt;/p&gt;
&lt;p&gt;
   With no further delay, let's get started.
&lt;/p&gt;
&lt;h1&gt;Code, RTTI, oh my
&lt;/h1&gt;
&lt;p&gt;
   When considering costs, we must always think about both size and speed.
&lt;/p&gt;
&lt;p&gt;
   There is at least as much assembly code created for an instantiation as the code you've
   written for the generic abstraction in C# or MSIL. A simple mental model – that thankfully
   turns out isn't entirely accurate, thanks to some sharing optimizations described
   below – is that for each instantiation of a generic type or method you get a new copy
   of that code specialized to the type in question. Obviously, this increases code size.
   And just as obviously, it will add some runtime cost to JIT compile the code (if you
   aren't using ahead of time compilation), as well as putting more pressure on I-cache
   and TLB.
&lt;/p&gt;
&lt;p&gt;
   Another source of significant cost is the runtime data structures needed for RTTI
   and Reflection, like vtables and other metadata. Quite simply, the runtime needs to
   know the identity of each generic instantiation, to prevent things like casting a
   List&amp;lt;Int16&amp;gt; to a List&amp;lt;String&amp;gt;, and even List&amp;lt;Object&amp;gt; to List&amp;lt;String&amp;gt;;
   and given that there is often distinct code generated for unique instantiations, the
   vtable contents for those different List&amp;lt;T&amp;gt; instantiations are going to look
   quite different.
&lt;/p&gt;
&lt;p&gt;
   And of course, there are statics. Each generic instantiation gets its own set, requiring
   extra storage and another level of indirection when fetching them. Unique statics
   means D-cache and TLB pressure. It turns out that code shared across AppDomains, like
   mscorlib.dll, already need such things. But I have found that it's surprisingly common
   for a developer to throw a static field (or nested class!) onto an outer generic type,
   without actually needing it to be replicated for each unique instantiation.
&lt;/p&gt;
&lt;p&gt;
   In addition to the immediate effects, generic types often refer to other generic types
   which refer to other generic types … and so on. Instantiating a root type is akin
   to instantiating the full transitive closure.
&lt;/p&gt;
&lt;p&gt;
   To make our discussion friendly and familiar, we shall use the .NET Framework's List&amp;lt;T&amp;gt;
   type – presumably one of the most commonly used generic types on the planet – to illustrate
   many of these costs. And unfortunately, you'll also see that many of the common performance
   pitfalls plague this type too. (So, really, you need not feel bad if your own code
   is guilty of them too.)
&lt;/p&gt;
&lt;h1&gt;Why the distinct code, anyway?
&lt;/h1&gt;
&lt;p&gt;
   There is only one copy of List&amp;lt;T&amp;gt;'s code in mscorlib's MSIL. It is essentially
   just a blueprint for the list class.
&lt;/p&gt;
&lt;p&gt;
   When I create a List&amp;lt;Int16&amp;gt; in my program and use it, however, there clearly
   needs to be some assembly code created in order to execute List&amp;lt;T&amp;gt;'s associated
   functionality, just with any T's used by List&amp;lt;T&amp;gt;'s code replaced by actual 2-byte
   short integers. And similarly, if I were to instantiate a List&amp;lt;String&amp;gt;, all
   those T's need to be replaced by pointer-sized object references, either 4- or 8-bytes
   depending on machine architecture, that are reported live to the garbage collector.
&lt;/p&gt;
&lt;p&gt;
   This is what leads to our simple mental model above, in which each instantiation gets
   its own copy of the code. In this case, both List&amp;lt;Int16&amp;gt; and List&amp;lt;String&amp;gt;
   would be entirely independent types at runtime, with wholly separate copies of the
   machine code.
&lt;/p&gt;
&lt;p&gt;
   Certainly if I manually went about creating my own Int16List and StringList types,
   they would be distinct types with distinct machine code generated. Being a prudent
   developer, however, I'd probably try to arrange to share as much of the implementation
   as possible between the two types, perhaps using implementation inheritance. But alas,
   there's no way I could share it all: any code specific to Int16 or String, for example,
   would surely differ, both in MSIL and in the native code.
&lt;/p&gt;
&lt;p&gt;
   Generics basically give you the ability to do this same thing, without you needing
   to do the factoring of type-independent and type-specific code yourself. The compiler
   does that for you.
&lt;/p&gt;
&lt;p&gt;
   Why might the code be different? As stated above, Int16 values are 2 bytes and String
   pointers are native word sized (4 bytes on 32-bit, 8 bytes on 64-bit). All the code
   that passes values of type T on the stack, either as arguments or return values, moves
   instances into and out of memory locations (like the T[] backing array), and so on,
   needs to be specialized based on the size of T. This wouldn't be true of a generics
   implementation that used type erasure, like Java's, but then you'd need to box the
   value types on the heap so that everything is a pointer. If T is a Float, we will
   likely emit code that uses floating point math instead of general purpose registers.
   Any tables that report GC roots are likely to be different, since object references
   can be embedded inside struct values that get laid out on the stack. And so on. Some
   day you might want to compare the machine code for a simple generic Echo&amp;lt;T&amp;gt;
   method for different kinds of T's; it is really easy to do, and is quite illustrative.
&lt;/p&gt;
&lt;p&gt;
   A naïve wish might go as follows. Imagine that I had written my own dedicated Int16List
   and StringList types, and that we diffed the resulting machine code between the distinct
   list types; we'd presumably find a fair bit of duplication for all the reasons stated
   above. It would be a nice property if, when we used the generic List&amp;lt;Int16&amp;gt;
   and List&amp;lt;String&amp;gt; types, and similarly diffed the resulting assembly, the amount
   of specialized code would be no greater than the amount of specialized code between
   our best hand-written Int16List and StringList types. I.e., only parts that need to
   be different are different.
&lt;/p&gt;
&lt;p&gt;
   We could go even further with our wish. Imagine I had a List&amp;lt;DateTime&amp;gt; and List&amp;lt;Int64&amp;gt;.
   Both are 8-byte values, and do not contain any GC references. If I were writing a
   specialized 8ByteValueList in C++ and had immense performance constraints, I would,
   again being a prudent developer, probably use some type unsafe code, with nasty reinterpret_casts,
   so that I could use the same list type to store any kind of 8-byte value. (Except
   in C++ I could even store pointers!) It would also be a nice property if generics
   did some of this for us, while still retaining the type safety we love about generics.
&lt;/p&gt;
&lt;p&gt;
   It turns out we will get neither of our wishes exactly, although we will get something
   close to the spirit of our wishes.
&lt;/p&gt;
&lt;h1&gt;Code sharing
&lt;/h1&gt;
&lt;p&gt;
   Indeed, the CLR does arrange to share many generic instantiations. The rule is simple,
   although it is subject to change in the future (being an optimization and all): instantiations
   over reference types are shared among all reference type instantiations for that generic
   type/method, whereas instantiations over value types get their own full copy of the
   code. In other words, List&amp;lt;String&amp;gt; and List&amp;lt;Object&amp;gt; are backed by the
   same code, but List&amp;lt;DateTime&amp;gt; and List&amp;lt;Int64&amp;gt; get their own.
&lt;/p&gt;
&lt;p&gt;
   It is true that, in theory, List&amp;lt;DateTime&amp;gt; and List&amp;lt;Int64&amp;gt; could use the
   same shared code, because they are of identical size and have GC roots in the same
   locations (trivially, because neither has one). But there are additional restrictions
   on generated code that makes this problematic, for example if we were talking about
   Double and Int64. In short, the CLR doesn't actually share value type instantiations
   as of the 4.0 runtime, although clearly it could in certain situations (value types
   of the same size with GC roots in the same locations).
&lt;/p&gt;
&lt;p&gt;
   As you might guess, this extends to multi-parameter generics in obvious ways. A Dictionary&amp;lt;Object,
   Object&amp;gt; is shared with a Dictionary&amp;lt;String, String&amp;gt;, etc., and a Dictionary&amp;lt;Int64,
   Object&amp;gt; is shared with a Dictionary&amp;lt;Int64, String&amp;gt;. A Dictionary&amp;lt;DateTime,
   DateTime&amp;gt; is not, however, shared with a Dictionary&amp;lt;Int64, Int64&amp;gt; instantiation,
   as per the above.
&lt;/p&gt;
&lt;p&gt;
   My pal &lt;a href="http://callvirt.net/blog/"&gt;Joel Pobar&lt;/a&gt; wrote &lt;a href="http://blogs.msdn.com/b/joelpob/archive/2004/11/17/259224.aspx"&gt;a
   post eons ago describing how code sharing works&lt;/a&gt; in great detail, which I do not
   intend to rehash. Please refer to his post for an excellent overview of how code sharing
   works.
&lt;/p&gt;
&lt;p&gt;
   An important thing to remember, however, is that no matter how much code sharing happens,
   you still need distinct RTTI data structures. So although List&amp;lt;Object&amp;gt; and List&amp;lt;String&amp;gt;
   share the same machine code, they have distinct vtables; sure, each table is full
   of pointers to the same code functions, but you are still paying for the runtime data
   structures. A distinct instantiation, therefore, is never actually free!
&lt;/p&gt;
&lt;h1&gt;Transitive closures
&lt;/h1&gt;
&lt;p&gt;
   Why am I making such a big deal about code sharing, anyway?
&lt;/p&gt;
&lt;p&gt;
   Another surprising aspect of generics is the transitive closure problem. Particularly
   when doing pre-compilation of generics, each unique instantiation doesn't simply lead
   to a specialized version of the code associated with the type being directly instantiated.
   The whole transitive closure of types, starting with that root type, will also be
   compiled. This can be a surprisingly huge number of types! JIT is much more pay-for-play,
   such that you get one level of explosion at a time, but once there is code that calls
   a particular type's method, even if that code is lazily compiled, creation of the
   type is forced.
&lt;/p&gt;
&lt;p&gt;
   To illustrate this, let's take our friend List&amp;lt;T&amp;gt;. Before examining the list,
   how many generic types would you expect that a single new List&amp;lt;T&amp;gt; instantiation
   instantiates?
&lt;/p&gt;
&lt;p&gt;
   What if I told you that a single List&amp;lt;int&amp;gt; instantiation creates (at least)
   28 types? And that, say, five unique instantiations of List&amp;lt;T&amp;gt; might cost you
   300K of disk space and 70K of working set? Well, of course, if you are writing a script,
   or something with fairly loose performance requirements, this might not matter much.
   But if topics like download time, mobile footprint, and cache performance are important
   to you, then you probably want to pay attention to this. To a first approximation,
   size &lt;u&gt;&lt;i&gt;is&lt;/i&gt;&lt;/u&gt; speed.
&lt;/p&gt;
&lt;p&gt;
   Yes, you heard me right: 28 types. Holy smokes... How can this be?!
&lt;/p&gt;
&lt;p&gt;
   Nested types are one obvious answer, and indeed List&amp;lt;T&amp;gt; has two: an Enumerator
   class (which is reasonable), and one to support the legacy synchronized collections
   pattern (which we presumably wish we didn't have to pay for). The larger answer here,
   however, is functionality. Yes, functionality! This is a great example where the cost
   of generics explodes as you add more features. Start simple, keep adding stuff, as
   has happened to List&amp;lt;T&amp;gt; over the years, and you will soon find that a series
   of elegant abstractions adds up to a gut-wrenching bucket of bytes.
&lt;/p&gt;
&lt;p&gt;
   Here's a quick sketch of the transitive closure of generic types used by List&amp;lt;T&amp;gt;:
&lt;/p&gt;
&lt;p&gt;
&lt;pre&gt;&lt;code&gt;List&amp;lt;T&amp;gt; T[] type IList&amp;lt;T&amp;gt; type ICollection&amp;lt;T&amp;gt; type IEnumerable&amp;lt;T&amp;gt;
type IEnumerator&amp;lt;T&amp;gt; type ReadOnlyCollection&amp;lt;T&amp;gt; type (AsReadOnly) (Nothing
more than List&amp;lt;T&amp;gt;) IComparer&amp;lt;T&amp;gt; type (BinarySearch, Sort) {Array.BinarySearch&amp;lt;T&amp;gt;
method (BinarySearch)} ArraySortHelper&amp;lt;T&amp;gt; type IArraySortHelper&amp;lt;T&amp;gt; type
GenericArraySortHelper&amp;lt;T&amp;gt; type EqualityComparer&amp;lt;T&amp;gt; type (Contains) IEqualityComparer&amp;lt;T&amp;gt;
type IEquatable&amp;lt;T&amp;gt; type NullableEqualityComparer&amp;lt;T&amp;gt; type Nullable&amp;lt;T&amp;gt;
type EnumEqualityComparer&amp;lt;T&amp;gt; type {JitHelpers.UnsafeEnumCast&amp;lt;T&amp;gt; method}
ObjectEqualityComparer&amp;lt;T&amp;gt; type Predicate&amp;lt;T&amp;gt; delegate type (Find*) Action&amp;lt;T&amp;gt;
delegate type (ForEach) {Array.LastIndexOf&amp;lt;T&amp;gt; method (LastIndexOf)} Comparison&amp;lt;T&amp;gt;
delegate type (Sort) Array.FunctorComparer&amp;lt;T&amp;gt; type (Sort) Comparer&amp;lt;T&amp;gt;
type GenericComparer&amp;lt;T&amp;gt; type NullableComparer&amp;lt;T&amp;gt; type ObjectComparer&amp;lt;T&amp;gt;
type {Array.Sort&amp;lt;T&amp;gt; method (Sort)} ArraySortHelper&amp;lt;T&amp;gt; type (see earlier)
Enumerator inner type SynchronizedList&amp;lt;T&amp;gt; inner type IList&amp;lt;T&amp;gt; interface
(see earlier) ICollection&amp;lt;T&amp;gt; interface (see earlier) IEnumerable&amp;lt;T&amp;gt; interface
(see earlier) {Interlocked.CompareExchange&amp;lt;Object&amp;gt; method (SyncRoot)} {_emptyArray
T[] static field} &lt;/code&gt;&lt;/pre&gt;
&gt;
&lt;p&gt;
   I'm not trying to pick on List&amp;lt;T&amp;gt;. This class is only unique in this regard
   in that it offers a large transitive closure of (mostly useful!) functionality. And
   it's not the only guilty party. We recently shaved off 100K's of code size on my team,
   for example, that were being lost simply because all the LINQ methods were declared
   as instance methods on the base collection class, rather than being extension methods
   as in .NET. We found nested enumerator and iterator types, cached static lambdas as
   static fields, and huge transitive closures of other generic types, all allocated
   when you just touched any collection type. Any collections library is apt to be full
   of this stuff, since they are highly generic. But collection libraries are certainly
   not the only places to go sniffing for such problems.
&lt;/p&gt;
&lt;p&gt;
   As an aside, it turns out that extension methods are a great way to make generic abstractions
   more pay-for-play.
&lt;/p&gt;
&lt;h1&gt;Adding it up
&lt;/h1&gt;
&lt;p&gt;
   Let's see what the above adds up to. I ran some programs through NGen as a quick and
   dirty experiment, and inspected the on-disk sizes and also the runtime working set
   sizes. I ensured clrjit.dll was not loaded into the process. Here's what I found.
   Take these numbers with a grain of salt, as they will change from release to release;
   they are simply rules of thumb. When in doubt, crank up NGen, DumpBin, and/or start
   trawling the heap with VADump yourself!
&lt;/p&gt;
&lt;p&gt;
   One empty type with no methods in CLR 4.0 seems to cost roughly 0.2K bytes of on-disk
   metadata, and about 0.7K in x64 working set. (This is a good rule of thumb irrespective
   of generics… in terms of order of magnitude, you can think "one empty type means 1K
   of memory.") A single List&amp;lt;S&amp;gt; instantiation, where S is an empty struct, is
   in the neighborhood of 60K on-disk metadata, and 14K of x64 working set. A single
   List&amp;lt;C&amp;gt; instantiation, however, where C is an empty class, is only – surprise
   – about 7K on-disk and 4K in-memory. Why the large discrepancy? Well, it just so happens
   that mscorlib.dll already includes an instantiation or two of List&amp;lt;T&amp;gt; over reference
   types, so this 4K is the incremental cost on top of reusing what is there; remember,
   there are still unique vtables and data structures still required for RTTI.
&lt;/p&gt;
&lt;p&gt;
   Rico did &lt;a href="http://blogs.msdn.com/b/ricom/archive/2005/08/26/performance-quiz-7-generics-improvements-and-costs-solution.aspx"&gt;a
   similar analysis a few years back&lt;/a&gt;, and concluded that each unique List&lt;E&gt;
      , where E was an enum type, cost 8K. Why the increase to 14K over the years? x64 and
      ever-increasing functionality on the basic collections classes, presumably. Remember,
      it's not just List&amp;lt;T&amp;gt; that has grown, it's also everything that List&amp;lt;T&amp;gt;
      uses internally as an implementation detail.
&lt;/p&gt;
&lt;h1&gt;Dynamic specialization with dictionaries
&lt;/h1&gt;
&lt;p&gt;
   Some specialization in behavior can be accomplished with dynamic runtime behavior,
   rather than static code specialization. A prime example is the following:
&lt;/p&gt;
&lt;p&gt;
&lt;pre&gt;&lt;code&gt;class C { public static void M&amp;lt;T&amp;gt;() { System.Console.WriteLine(typeof(T).Name);
} } &lt;/code&gt;&lt;/pre&gt;
&gt;
&lt;p&gt;
   Where does the program get the value of typeof(T) from? If you look at the MSIL, you
   will see that C# has emitted a ldtoken MSIL instruction. For some struct type, we
   can compile that as a constant in the code, because it is getting its own copy of
   the code. What occurs when two instantiations share code, like M&amp;lt;String&amp;gt; and
   M&amp;lt;Object&amp;gt;, however? As you might guess, there is an indirection.
&lt;/p&gt;
&lt;p&gt;
   The thing we usually use for such indirections – the vtable – is nowhere to be found
   in this particular example, because M is a static method. To deal with this, the compiler
   inserts an extra "hidden" argument, frequently called a generic dictionary, from which
   the emitted assembly code can fetch the type token. The cost here typically isn't
   bad, because many of the operations that pull in the dictionary are already RTTI or
   Reflection-based, and would require an indirection already (e.g., through a vtable).
&lt;/p&gt;
&lt;p&gt;
   The operations which require a dictionary of some kind include anything that has to
   do with RTTI and yet no vtable is readily accessible: typeof, casts, is and as operators,
   etc. And as you might guess, if instantiations aren't shared (such as with value types
   on the CLR), no dictionary is needed, because the code is fully specialized. There
   are also multiple kinds of dictionaries used by the runtime, depending on whether
   you are using a generic type, method, or some combination of both.
&lt;/p&gt;
&lt;h1&gt;JITting when you didn't mean to
&lt;/h1&gt;
&lt;p&gt;
   There are two primary ways in which you will JIT compile when using generics, even
   if you were good doobie and used NGen to reduce startup time.
&lt;/p&gt;
&lt;p&gt;
   One way is if you instantiate a new generic type exported from mscorlib.dll with a
   type argument also defined in mscorlib.dll, that wasn't already instantiated inside
   mscorlib.dll. (See my old &lt;a href="http://www.bluebytesoftware.com/blog/2005/03/23/DGUpdateGenericsAndPerformance.aspx"&gt;Generics
   and Performance&lt;/a&gt; blog entry for more details.) You can very easily see this happening
   by using an instantiation like Dictionary&amp;lt;DateTime, DateTime&amp;gt;, and watching
   the clrjit.dll module getting loaded.
&lt;/p&gt;
&lt;p&gt;
   The other way is generic virtual methods (GVMs). It turns out that GVMs pose incredible
   difficulty for ahead of time separate compilation, because the compiler cannot know
   statically which slot in the vtable points at the particular implementation you are
   about to call. (Unless you use whole program compilation, something not offered by
   .NET at present time.) For each such method, there's an unbounded set of possible
   specialized instantiations a slot might point to, and so the vtable cannot be laid
   out in a traditional manner. C++ doesn't allow templated virtual methods for this
   very reason.
&lt;/p&gt;
&lt;p&gt;
   Thankfully, GVMs are somewhat rare. However, it only took 5 minutes of poking around
   to find one that is quite front-and-center in .NET: in the implementation of LINQ,
   there is an Iterator&amp;lt;T&amp;gt; type that has a method declared as follows:
&lt;/p&gt;
&lt;p&gt;
&lt;pre&gt;&lt;code&gt;public abstract IEnumerable&amp;lt;TResult&amp;gt; Select&amp;lt;TResult&amp;gt;(Func&amp;lt;TSource,
TResult&amp;gt; selector); &lt;/code&gt;&lt;/pre&gt;
&gt;
&lt;p&gt;
   All we need to do is figure out how to tickle that method, and we're guaranteed to
   JIT. As it turns out, sure enough, the following code does the trick and forces clrjit.dll
   to get loaded in .NET 4.0:
&lt;/p&gt;
&lt;p&gt;
&lt;pre&gt;&lt;code&gt;int[] xs = …; int[] ys = xs.Where(x =&gt; true).Select(x =&gt; x).ToArray(); &lt;/code&gt;&lt;/pre&gt;
&gt;
&lt;p&gt;
   The Iterator&amp;lt;T&amp;gt; type is used for back-to-back Where and Select operators, as
   a performance optimization that avoids excess allocations and interface dispatch.
   But because it depends on a GVM, it does incur an initial penalty for using it, even
   if you have used NGen to avoid runtime code generation.
&lt;/p&gt;
&lt;h1&gt;In conclusion
&lt;/h1&gt;
&lt;p&gt;
   The moral of the story here is &lt;i&gt;not&lt;/i&gt; that you should fear generics. Beautiful
   things can be built with them.
&lt;/p&gt;
&lt;p&gt;
   Instead, it's to use generics thoughtfully. Nothing in life is free, and generics
   are no exception to this rule. If code size is important to you, then you will want
   to have performance gates measuring your numbers against your goals; if you are working
   in a codebase that uses generics heavily, and you end up spending any significant
   time on code size optimizations, you will want to try to track down large transitive
   closures. As I stated above, you could really be throwing away 100K's of code here.
&lt;/p&gt;
&lt;p&gt;
   And as to the surprise JITting, I've seen teams compiling with NGen and having a functional
   gate that fails any new code that causes clrjit.dll to get loaded at runtime. Although
   tracking down the root cause might be tricky when that gate fails, at least you won't
   let the camel's nose under the tent.
&lt;/p&gt;
&lt;p&gt;
   Investing in tools here is a very good idea.
&lt;/p&gt;
&lt;p&gt;
   When it comes down to it, really thinking about what code must be executed by the
   process is helpful. Step back and imagine you were writing this all in C++, with the
   associated performance concerns front-and-center: consider how you'd arrange to reuse
   as much implementation as you can, manage memory efficiently, perhaps employ unsafe
   tricks that would have violated type safety and so are offlimits in .NET, and all
   that jazz. Then step back and be grateful that you have a type- and memory-safe environment
   to help you write more robust code, but also be realistic about what you are paying
   in exchange.
&lt;/p&gt;
&lt;p&gt;
   I hope you've learned a useful thing or two in this article. If you'd like to learn
   more, here are a few other good resources:
&lt;/p&gt;
&lt;ol&gt;
   &lt;li&gt;
      An MSR paper on the original implementation of .NET generics: &lt;a href="http://research.microsoft.com/pubs/64031/designandimplementationofgenerics.pdf"&gt;http://research.microsoft.com/pubs/64031/designandimplementationofgenerics.pdf&lt;/a&gt; 
   &lt;li&gt;
      Rico's "Six Questions About Generics and Performance" blog entry: &lt;a href="http://blogs.msdn.com/b/ricom/archive/2004/09/13/229025.aspx"&gt;http://blogs.msdn.com/b/ricom/archive/2004/09/13/229025.aspx&lt;/a&gt; 
   &lt;li&gt;
      Joel Pobar's "Generics and Code Sharing" blog post: &lt;a href="http://blogs.msdn.com/b/joelpob/archive/2004/11/17/259224.aspx"&gt;http://blogs.msdn.com/b/joelpob/archive/2004/11/17/259224.aspx&lt;/a&gt; 
   &lt;li&gt;
      My "Generics and Performance" blog entry: &lt;a href="http://www.bluebytesoftware.com/blog/2005/03/23/DGUpdateGenericsAndPerformance.aspx"&gt;http://www.bluebytesoftware.com/blog/2005/03/23/DGUpdateGenericsAndPerformance.aspx&lt;/a&gt; 
&lt;/ol&gt;
&lt;p&gt;
   Cheers.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=2d01b5eb-7f19-4a4c-a3b3-1cecb1e3dc75" /&gt;</description>
      <category>Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=d59f5b8a-0373-4143-8f98-999e044a4ad0</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,d59f5b8a-0373-4143-8f98-999e044a4ad0.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
      InfoQ recently asked me a few questions about concurrency and programming languages,
      and here is what I had to say:
   </p>
        <blockquote dir="ltr" style="MARGIN-RIGHT: 0px">
          <p>
            <a href="http://www.infoq.com/articles/Interview-Joe-Duffy">http://www.infoq.com/articles/Interview-Joe-Duffy</a>
          </p>
        </blockquote>
        <p>
      A little teaser:
   </p>
        <blockquote dir="ltr" style="MARGIN-RIGHT: 0px">
          <p>
            <em>"The major shift we face will be that mainstream languages will start to incorporate
      more concurrency-safety -- immutability and isolation -- and the platform libraries
      and architectures will better support this style of software decomposition. OOP developers
      are accustomed to partitioning their program into classes and objects; now they will
      need to become accustomed to partitioning their program into asynchronous Actors that
      can run concurrently. Within this sea of asynchrony will lay ordinary imperative code,
      frequently augmented with fine-grained task and data parallelism."</em>
          </p>
        </blockquote>
        <p>
      As an aside, I know I've been super quiet lately.  I never thought I'd go months
      without blogging.  My sincere apologies for this; work has been too all-consuming
      / fun, and I've been unable to carve out much time for anything else.  (Speaking
      of which, we are still hiring: email me at joedu at you-know-where dot com if you
      are interested.)  I'm about to head out to Europe for a few weeks, where hopefully
      I'll have a bit of time to write up something exciting to share.  Cheers.
   </p>
        <img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=d59f5b8a-0373-4143-8f98-999e044a4ad0" />
      </body>
      <title>An interview with InfoQ</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,d59f5b8a-0373-4143-8f98-999e044a4ad0.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2011/06/01/AnInterviewWithInfoQ.aspx</link>
      <pubDate>Wed, 01 Jun 2011 17:57:44 GMT</pubDate>
      <description>&lt;p&gt;
   InfoQ recently asked me a few questions about concurrency and programming languages,
   and here is what I had to say:
&lt;/p&gt;
&lt;blockquote dir=ltr style="MARGIN-RIGHT: 0px"&gt; 
&lt;p&gt;
   &lt;a href="http://www.infoq.com/articles/Interview-Joe-Duffy"&gt;http://www.infoq.com/articles/Interview-Joe-Duffy&lt;/a&gt;
&lt;/p&gt;
&lt;/blockquote&gt; 
&lt;p&gt;
   A little teaser:
&lt;/p&gt;
&lt;blockquote dir=ltr style="MARGIN-RIGHT: 0px"&gt; 
&lt;p&gt;
   &lt;em&gt;"The major shift we face will be that mainstream languages will start to incorporate
   more concurrency-safety -- immutability and isolation -- and the platform libraries
   and architectures will better support this style of software decomposition. OOP developers
   are accustomed to partitioning their program into classes and objects; now they will
   need to become accustomed to partitioning their program into asynchronous Actors that
   can run concurrently. Within this sea of asynchrony will lay ordinary imperative code,
   frequently augmented with fine-grained task and data parallelism."&lt;/em&gt;
&lt;/p&gt;
&lt;/blockquote&gt; 
&lt;p&gt;
   As an aside, I know I've been super quiet lately.&amp;nbsp; I never thought I'd go months
   without blogging.&amp;nbsp; My sincere apologies for this; work has been too all-consuming
   / fun, and I've been unable to carve out much time for anything else.&amp;nbsp; (Speaking
   of which, we are still hiring: email me at joedu at you-know-where dot com if you
   are interested.)&amp;nbsp; I'm about to head out to Europe for a few weeks, where hopefully
   I'll have a bit of time to write up something exciting to share.&amp;nbsp; Cheers.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=d59f5b8a-0373-4143-8f98-999e044a4ad0" /&gt;</description>
      <category>Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=a0484627-c752-45f6-a2ac-414130cb3d2f</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,a0484627-c752-45f6-a2ac-414130cb3d2f.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
      After spending more time than I’d like to admit over the years researching memory
      model literature (particularly Java’s terrific JMM and related publications), subsequently
      trying to “fix” the CLR memory model, reviewing proposals for new C++ memory models,
      and beating my head against the wall for months developing a new memory model that
      supports weakly ordered processors like the kind you’ll find on ARM in a developer-friendly
      yet power-efficient way, I have a conclusion to share.
   </p>
        <h1>Volatile is evil
   </h1>
        <p>
      Why? Let me recount the reasons:
   </p>
        <ol>
          <li>
         It doesn’t mean what you think. 
      </li>
          <li>
         It used to have a very specific purpose — to enure memory operations with external
         side-effects did not get reordered — and has since gotten bastardized and used for
         many secondarily-related purposes. 
      </li>
          <li>
         Even if you think it does mean what you think, the annotation scheme is all wrong.
         Volatile annotates a storage location, and yet what really matters is what happens
         when accessing said storage locations. The fences occur when you access the variable,
         not when you declare it. And yet from a readability perspective, they are completely
         invisible and easy to miss. 
      </li>
          <li>
         And even if you don’t care about readability, the meaning of volatile changes wildly
         when you switch platforms. Today it’s store / release, tomorrow it’s write / read
         fences. Perhaps it’s even sequentially consistent. And the label of “store / release”
         could actually be a white lie, as with the CLR’s memory model thanks to store buffer
         forwarding and the lack of fences in the CLR JIT’s x64 stores. 
      </li>
          <li>
         Performance, man, performance! Sure sequential consistency as the default sounds nice
         on the tin, but once you’re running that mobile app on ARM, and sucking up 160 cycles
         for each write you perform, you’re going to curse volatile like the plague. 
      </li>
        </ol>
        <p>
      And so the moral of the story follows...
   </p>
        <h1>Attempting to “fix” volatile is a waste of time
   </h1>
        <p>
      Instead, a new world order has arrived. We must take a two-pronged approach to solving
      instruction-level interleaving bugs, neither prong having much to do with the traditional
      definition of memory models or volatiles. We must:
   </p>
        <ol>
          <li>
         Eliminate memory ordering from 99% of developers’ purviews. This is already the case
         with single-threaded programs, because code motion in compilers and processors is
         limited to what only affects concurrent observations. So the answer is pretty clear:
         developers must move towards single-threaded programming models connected through
         message passing, optionally with provably race-free fine-grained parallelism inside
         of those single-threaded worlds. 
      </li>
          <li>
         Leave the memory model esoterica to the Einsteins, and radically change its meaning.
         Data dependence and transitive visibility of memory operations are in. Volatiles on
         storage locations are out. Instead, we must throw fences into programmer’s faces,
         and force them to understand each and every one that occurs. And moreover, force them
         to decide about each and every one that occurs. Specifically, hidden fences thanks
         to volatile are no longer. Those who cannot take it should fall into the 99% bucket
         already mentioned above, versus the 1% bucket. 
      </li>
        </ol>
        <p>
      Let’s set #1 aside for now, since it’s obviously a huge can of worms.
   </p>
        <p>
      But what about #2? It is quite encouraging that the C++0x group is firmly on the path
      of #2. See <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2145.html">http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2145.html</a> for
      more details. In a nutshell, each location that you’d have ordinarily tagged volatile
      instead becomes a template atomic<T>
         type. And then each read / write has the opportunity to specify the desired kind of
         fence, whether that is acquire (for reads), release (for writes), fully ordered, or
         relaxed (meaning no fence).
      </T></p>
        <p>
      I do think it’d be worth them considering compiler-only fences too. So that relaxed
      means fully-relaxed, and there is a fence in-between that merely prevents the compiler
      from optimizing the memory operation. This pays homage to volatile’s legacy in C as
      merely a variable that mustn’t be subject to optimizations, because operations against
      these variables pertain to, say, memory-mapped device I/O.
   </p>
        <p>
      Another nitpick of mine is that I’d have required each access to specify the fence,
      whereas C++0x implicitly uses full fence if left unspecified at the callsite. It’s
      a minor convenience, but I like always having the fence spelled out very explicitly
      in the code. Lock-free access to shared variables is sufficiently dangerous that automatic
      sequential consistency is the least of your problems.
   </p>
        <p>
      Nevertheless, the C++0x direction is a massive good step forward, and these are just
      minor details.
   </p>
        <p>
      My hope is that .NET follows suit. And the timing couldn’t be more apropos as “now”:
      we are moving forward in a heavily mobile, distributed-system-on-a-chip, and heterogeneous
      world, where processor memory models will necessarily continue to weaken. The overly
      strong x86 memory model, kept alive primarily to ensure compatibility, has simply
      grown too expensive to accommodate. The power benefits and architectural simplifications
      are hard to argue with, and because compatibility becomes less of an issue as new
      platforms arise (e.g. for mobile), the world moves to the cloud, and hence there is
      legacy to worry about, I do hope that processor vendors seize the opportunity. ARM
      certainly has. It is less about out-of-order execution as it is about coherency costs.
      Truthfully, I’d be disappointed if anything else happened, even though the risk to
      compatibility for shrink-wrapped software scares the hell out of me. But this is most
      certainly the right way to go, long-term. As software platforms move in the direction
      of #1 – as I also foresee – the need for fences dwindles. The cost of supporting the
      current .NET memory model is too great and will become a liability with time.
   </p>
        <p>
      Thankfully, it is quite simple to build a veneer atop .NET that works a lot more like
      atomic<T>
         . For example, imagine that we had a new System.Threading.Volatile static class, and
         that it offered the moral equivalent to atomic<T>
            inner types for each atomic primitive we can synchronize against:
         </T></T></p>
        <p>
        </p>
        <pre>
          <code>namespace System.Threading { public static class Volatile { public
   struct Int32 {..} public struct Int64 {..} … public struct Reference<T>
      where T : class {..} … } }
   </T></code>
        </pre>
        <p>
      Now instead of tagging a location as ‘volatile’, you would use one of these primitives.
      For example, rather than:
   </p>
        <p>
        </p>
        <pre>
          <code>static volatile MySingleton s_instance;</code>
        </pre>
        <p>
      You would say:
   </p>
        <p>
        </p>
        <pre>
          <code>static Volatile.Reference&lt;MySingleton&gt; s_instance;</code>
        </pre>
        <p>
      Each class has a similar set of operations. For example:
   </p>
        <p>
        </p>
        <pre>
          <code>namespace System.Threading { public static class Volatile { public
   struct Int32 { public Int32(int value); public int ReadAcquireFence(); public int
   ReadFullFence(); public int ReadCompilerOnlyFence(); public int ReadUnfenced(); public
   void WriteReleaseFence(int newValue); public void WriteFullFence(int newValue); public
   void WriteCompilerOnlyFence(int newValue); public void WriteUnfenced(int newValue);
   public int AtomicCompareExchange(int newValue, int comparand); public int AtomicExchange(int
   newValue); public int AtomicAdd(int delta); public int AtomicIncrement(); public int
   AtomicDecrement(); // Etc…, bitwise ops, other math ops, etc. } } }</code>
        </pre>
        <p>
      Of course, only the integer types would offer the increment, decrement, add, and related
      operators. And it turns out that offering different kinds of fences on the Atomic*
      operations would be incredibly useful too, because processors like ARM do not couple
      the fence to the compare-and-swap / load-locked-store-conditional as x86 processors
      do. Taking advantage of this can be huge if you are writing performance critical code,
      like a concurrent garbage collector whose atomic swaps need not imply ordering with
      the surrounding instruction stream. You can quibble over the details, like whether
      these should use enums instead of the name to encode the fence-kind. I did it this
      way to keep the implementations branch-free, although with a decent inlining JIT compiler,
      it’d probably optimize those away thanks to constant propagation.
   </p>
        <p>
      It’s quite trivial to implement these APIs atop existing .NET primitives. I built
      a little library that does so, but it was so boring and repetitive I decided not to
      post it alongside this blog entry as originally intended.
   </p>
        <p>
      With the above definition, we can very clearly see the fences involved in doing, say,
      double-checked locking:
   </p>
        <p>
        </p>
        <pre>
          <code>static Volatile.Reference&lt;MySingleton&gt; s_instance; public static
   MySingleton Instance { get { MySingleton instance = s_instance.ReadAcquireFence();
   if (instance == null) { instance = new MySingleton(); instance = s_instance.AtomicCompareExchangeRelease(instance,
   null); } return instance; } }</code>
        </pre>
        <p>
      We see there are two fences. One is an acquire and, depending on what your memory
      model says about data dependence, is probably unnecessary. Most sane memory models
      guarantee that data dependent loads do not pass. So we needn’t worry that we’ll see
      a non-null s_instance whose contents haven’t been initialized. (If we were talking
      structs, it’d be another story.) Nevertheless, it’s definitely required that we use
      a release-only fence for the publication of the object. This guarantees writes to
      fields within the MySingleton constructor have completed prior to the write of the
      new object to the shared instance field. The point here is that you are forced to
      think about the fences, and you actually see them.
   </p>
        <p>
      Of course, most platforms need to provide the bare minimum of fencing to assure type
      safety, particularly for languages like C#. My understanding is that C++0x has decided,
      at least for now, not to offer type-safety in the face of multithreading. That means
      you might publish an object and, if stores occur out-of-order, the reader could see
      an object partially initialized with an invalid vtable pointer. In C# and Java, the
      language designers have thankfully decided to shield programmers from this. The need
      for fences also extends to unsafe code like strings, where – were it possible for
      a thread to read the non-zero length before the char* pointer was valid – writes to
      random memory could occur and hence threaten type-safety. Thankfully, again, C# and
      Java protect developers from this, mostly due to the automatic zero’ing of memory
      as the GC allocates and hands out new segments.
   </p>
        <p>
      There are costs to offering this type safety assurance. So you can understand why
      the C++ designers want to keep fences out of object allocation. If you have #1 above,
      however, the costs are dramatically lower and more acceptable. But the world is –
      unfortunately – still a freethreaded one, and we have several years to go before we’ve
      reached the final destination. As a step forward, however, the death of volatile is
      a welcomed one. Say it with me.
   </p>
        <h1>“Sayonara volatile.”
   </h1>
        <p>
      Here’s hoping that .NET 5.0 takes this step forward too.
   </p>
        <img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=a0484627-c752-45f6-a2ac-414130cb3d2f" />
      </body>
      <title>Sayonara volatile</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,a0484627-c752-45f6-a2ac-414130cb3d2f.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2010/12/04/SayonaraVolatile.aspx</link>
      <pubDate>Sat, 04 Dec 2010 23:16:34 GMT</pubDate>
      <description>&lt;p&gt;
   After spending more time than I’d like to admit over the years researching memory
   model literature (particularly Java’s terrific JMM and related publications), subsequently
   trying to “fix” the CLR memory model, reviewing proposals for new C++ memory models,
   and beating my head against the wall for months developing a new memory model that
   supports weakly ordered processors like the kind you’ll find on ARM in a developer-friendly
   yet power-efficient way, I have a conclusion to share.
&lt;/p&gt;
&lt;h1&gt;Volatile is evil
&lt;/h1&gt;
&lt;p&gt;
   Why? Let me recount the reasons:
&lt;/p&gt;
&lt;ol&gt;
   &lt;li&gt;
      It doesn’t mean what you think. 
   &lt;li&gt;
      It used to have a very specific purpose — to enure memory operations with external
      side-effects did not get reordered — and has since gotten bastardized and used for
      many secondarily-related purposes. 
   &lt;li&gt;
      Even if you think it does mean what you think, the annotation scheme is all wrong.
      Volatile annotates a storage location, and yet what really matters is what happens
      when accessing said storage locations. The fences occur when you access the variable,
      not when you declare it. And yet from a readability perspective, they are completely
      invisible and easy to miss. 
   &lt;li&gt;
      And even if you don’t care about readability, the meaning of volatile changes wildly
      when you switch platforms. Today it’s store / release, tomorrow it’s write / read
      fences. Perhaps it’s even sequentially consistent. And the label of “store / release”
      could actually be a white lie, as with the CLR’s memory model thanks to store buffer
      forwarding and the lack of fences in the CLR JIT’s x64 stores. 
   &lt;li&gt;
      Performance, man, performance! Sure sequential consistency as the default sounds nice
      on the tin, but once you’re running that mobile app on ARM, and sucking up 160 cycles
      for each write you perform, you’re going to curse volatile like the plague. 
&lt;/ol&gt;
&lt;p&gt;
   And so the moral of the story follows...
&lt;/p&gt;
&lt;h1&gt;Attempting to “fix” volatile is a waste of time
&lt;/h1&gt;
&lt;p&gt;
   Instead, a new world order has arrived. We must take a two-pronged approach to solving
   instruction-level interleaving bugs, neither prong having much to do with the traditional
   definition of memory models or volatiles. We must:
&lt;/p&gt;
&lt;ol&gt;
   &lt;li&gt;
      Eliminate memory ordering from 99% of developers’ purviews. This is already the case
      with single-threaded programs, because code motion in compilers and processors is
      limited to what only affects concurrent observations. So the answer is pretty clear:
      developers must move towards single-threaded programming models connected through
      message passing, optionally with provably race-free fine-grained parallelism inside
      of those single-threaded worlds. 
   &lt;li&gt;
      Leave the memory model esoterica to the Einsteins, and radically change its meaning.
      Data dependence and transitive visibility of memory operations are in. Volatiles on
      storage locations are out. Instead, we must throw fences into programmer’s faces,
      and force them to understand each and every one that occurs. And moreover, force them
      to decide about each and every one that occurs. Specifically, hidden fences thanks
      to volatile are no longer. Those who cannot take it should fall into the 99% bucket
      already mentioned above, versus the 1% bucket. 
&lt;/ol&gt;
&lt;p&gt;
   Let’s set #1 aside for now, since it’s obviously a huge can of worms.
&lt;/p&gt;
&lt;p&gt;
   But what about #2? It is quite encouraging that the C++0x group is firmly on the path
   of #2. See &lt;a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2145.html"&gt;http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2145.html&lt;/a&gt; for
   more details. In a nutshell, each location that you’d have ordinarily tagged volatile
   instead becomes a template atomic&lt;T&gt;
      type. And then each read / write has the opportunity to specify the desired kind of
      fence, whether that is acquire (for reads), release (for writes), fully ordered, or
      relaxed (meaning no fence).
&lt;/p&gt;
&lt;p&gt;
   I do think it’d be worth them considering compiler-only fences too. So that relaxed
   means fully-relaxed, and there is a fence in-between that merely prevents the compiler
   from optimizing the memory operation. This pays homage to volatile’s legacy in C as
   merely a variable that mustn’t be subject to optimizations, because operations against
   these variables pertain to, say, memory-mapped device I/O.
&lt;/p&gt;
&lt;p&gt;
   Another nitpick of mine is that I’d have required each access to specify the fence,
   whereas C++0x implicitly uses full fence if left unspecified at the callsite. It’s
   a minor convenience, but I like always having the fence spelled out very explicitly
   in the code. Lock-free access to shared variables is sufficiently dangerous that automatic
   sequential consistency is the least of your problems.
&lt;/p&gt;
&lt;p&gt;
   Nevertheless, the C++0x direction is a massive good step forward, and these are just
   minor details.
&lt;/p&gt;
&lt;p&gt;
   My hope is that .NET follows suit. And the timing couldn’t be more apropos as “now”:
   we are moving forward in a heavily mobile, distributed-system-on-a-chip, and heterogeneous
   world, where processor memory models will necessarily continue to weaken. The overly
   strong x86 memory model, kept alive primarily to ensure compatibility, has simply
   grown too expensive to accommodate. The power benefits and architectural simplifications
   are hard to argue with, and because compatibility becomes less of an issue as new
   platforms arise (e.g. for mobile), the world moves to the cloud, and hence there is
   legacy to worry about, I do hope that processor vendors seize the opportunity. ARM
   certainly has. It is less about out-of-order execution as it is about coherency costs.
   Truthfully, I’d be disappointed if anything else happened, even though the risk to
   compatibility for shrink-wrapped software scares the hell out of me. But this is most
   certainly the right way to go, long-term. As software platforms move in the direction
   of #1 – as I also foresee – the need for fences dwindles. The cost of supporting the
   current .NET memory model is too great and will become a liability with time.
&lt;/p&gt;
&lt;p&gt;
   Thankfully, it is quite simple to build a veneer atop .NET that works a lot more like
   atomic&lt;T&gt;
      . For example, imagine that we had a new System.Threading.Volatile static class, and
      that it offered the moral equivalent to atomic&lt;T&gt;
         inner types for each atomic primitive we can synchronize against:
&lt;/p&gt;
&lt;p&gt;
&lt;pre&gt;&lt;code&gt;namespace System.Threading { public static class Volatile { public struct
Int32 {..} public struct Int64 {..} … public struct Reference&lt;T&gt;
   where T : class {..} … } }
&lt;/code&gt;&lt;/pre&gt;
&gt;
&lt;p&gt;
   Now instead of tagging a location as ‘volatile’, you would use one of these primitives.
   For example, rather than:
&lt;/p&gt;
&lt;p&gt;
&lt;pre&gt;&lt;code&gt;static volatile MySingleton s_instance;&lt;/code&gt;&lt;/pre&gt;
&gt;
&lt;p&gt;
   You would say:
&lt;/p&gt;
&lt;p&gt;
&lt;pre&gt;&lt;code&gt;static Volatile.Reference&amp;lt;MySingleton&amp;gt; s_instance;&lt;/code&gt;&lt;/pre&gt;
&gt;
&lt;p&gt;
   Each class has a similar set of operations. For example:
&lt;/p&gt;
&lt;p&gt;
&lt;pre&gt;&lt;code&gt;namespace System.Threading { public static class Volatile { public struct
Int32 { public Int32(int value); public int ReadAcquireFence(); public int ReadFullFence();
public int ReadCompilerOnlyFence(); public int ReadUnfenced(); public void WriteReleaseFence(int
newValue); public void WriteFullFence(int newValue); public void WriteCompilerOnlyFence(int
newValue); public void WriteUnfenced(int newValue); public int AtomicCompareExchange(int
newValue, int comparand); public int AtomicExchange(int newValue); public int AtomicAdd(int
delta); public int AtomicIncrement(); public int AtomicDecrement(); // Etc…, bitwise
ops, other math ops, etc. } } }&lt;/code&gt;&lt;/pre&gt;
&gt;
&lt;p&gt;
   Of course, only the integer types would offer the increment, decrement, add, and related
   operators. And it turns out that offering different kinds of fences on the Atomic*
   operations would be incredibly useful too, because processors like ARM do not couple
   the fence to the compare-and-swap / load-locked-store-conditional as x86 processors
   do. Taking advantage of this can be huge if you are writing performance critical code,
   like a concurrent garbage collector whose atomic swaps need not imply ordering with
   the surrounding instruction stream. You can quibble over the details, like whether
   these should use enums instead of the name to encode the fence-kind. I did it this
   way to keep the implementations branch-free, although with a decent inlining JIT compiler,
   it’d probably optimize those away thanks to constant propagation.
&lt;/p&gt;
&lt;p&gt;
   It’s quite trivial to implement these APIs atop existing .NET primitives. I built
   a little library that does so, but it was so boring and repetitive I decided not to
   post it alongside this blog entry as originally intended.
&lt;/p&gt;
&lt;p&gt;
   With the above definition, we can very clearly see the fences involved in doing, say,
   double-checked locking:
&lt;/p&gt;
&lt;p&gt;
&lt;pre&gt;&lt;code&gt;static Volatile.Reference&amp;lt;MySingleton&amp;gt; s_instance; public static
MySingleton Instance { get { MySingleton instance = s_instance.ReadAcquireFence();
if (instance == null) { instance = new MySingleton(); instance = s_instance.AtomicCompareExchangeRelease(instance,
null); } return instance; } }&lt;/code&gt;&lt;/pre&gt;
&gt;
&lt;p&gt;
   We see there are two fences. One is an acquire and, depending on what your memory
   model says about data dependence, is probably unnecessary. Most sane memory models
   guarantee that data dependent loads do not pass. So we needn’t worry that we’ll see
   a non-null s_instance whose contents haven’t been initialized. (If we were talking
   structs, it’d be another story.) Nevertheless, it’s definitely required that we use
   a release-only fence for the publication of the object. This guarantees writes to
   fields within the MySingleton constructor have completed prior to the write of the
   new object to the shared instance field. The point here is that you are forced to
   think about the fences, and you actually see them.
&lt;/p&gt;
&lt;p&gt;
   Of course, most platforms need to provide the bare minimum of fencing to assure type
   safety, particularly for languages like C#. My understanding is that C++0x has decided,
   at least for now, not to offer type-safety in the face of multithreading. That means
   you might publish an object and, if stores occur out-of-order, the reader could see
   an object partially initialized with an invalid vtable pointer. In C# and Java, the
   language designers have thankfully decided to shield programmers from this. The need
   for fences also extends to unsafe code like strings, where – were it possible for
   a thread to read the non-zero length before the char* pointer was valid – writes to
   random memory could occur and hence threaten type-safety. Thankfully, again, C# and
   Java protect developers from this, mostly due to the automatic zero’ing of memory
   as the GC allocates and hands out new segments.
&lt;/p&gt;
&lt;p&gt;
   There are costs to offering this type safety assurance. So you can understand why
   the C++ designers want to keep fences out of object allocation. If you have #1 above,
   however, the costs are dramatically lower and more acceptable. But the world is –
   unfortunately – still a freethreaded one, and we have several years to go before we’ve
   reached the final destination. As a step forward, however, the death of volatile is
   a welcomed one. Say it with me.
&lt;/p&gt;
&lt;h1&gt;“Sayonara volatile.”
&lt;/h1&gt;
&lt;p&gt;
   Here’s hoping that .NET 5.0 takes this step forward too.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=a0484627-c752-45f6-a2ac-414130cb3d2f" /&gt;</description>
      <category>Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=558a8e4d-efe4-4f61-99e1-f46f8af76e0e</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,558a8e4d-efe4-4f61-99e1-f46f8af76e0e.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
      I rambled on and on a few weeks back about how much performance matters. Although
      I got a lot of contrary feedback, most of it had to do with my deliberately controversial
      title than the content of the article. I had intended to post a redux, but something
      more concise is on my mind lately.
   </p>
        <p>
      GC-based memory management is a boon to productivity, not to mention program safety.
      Few would argue with this. However, the most effective developers know how their particular
      GC works, and optimize their program’s data structure, allocation, and lifetime behavior
      to suit their particular GC best.
   </p>
        <p>
      This is dangerous, but a pragmatic fact of life. It is dangerous because who’s to
      say that the runtime team doesn’t intend to entirely revamp the GC’s collection strategy
      next release, at which point your thoughtfulness may actually harm you? It’s a pragmatic
      fact for a few reasons: it’s probably not likely that the behavior of your favorite
      GC is going to change too fundamentally over time; if it did, you’d need to rethink
      things anyhow; oh, and when was that next release anyway (2 or more years out); and
      finally, what do you care about more, the theoretical loose coupling, or real results
      today?
   </p>
        <p>
      One of the worst data structures for traversal is a linked list. That’s because its
      contents are fetched by pointer-chasing, an act that usually destroys locality, unless
      the data associated with each pointer was carefully constructed to live next to its
      previous and next pointer’s data. This seldom happens, because the main point of a
      linked list is to free you from such constraints.
   </p>
        <p>
      One of the best data structures for traversal, on the other hand, is an array. Adjacent
      elements are truly adjacent to one another in memory, meaning that as you fetch the
      i’th element from memory, you’re probably pulling in the i’th, i+1’th, …, and so on,
      thanks to spatial locality. Of course, if the elements are just pointers, then you're
      back to the chasing game; as with anything, it depends.
   </p>
        <p>
      How many elements you prefetch of course depends on the size of the elements with
      respect to your processor’s cache line. If you’re working with 8-byte elements, and
      128-byte cache lines, then you may pay 100 cycles for the first fetch, and then amortize
      that cost over the subsequent 15 found cheaply in cache for 10 cycles. The result
      is about 250 cycles total; compare this to a linked list, where you’ll probably spend
      1,600 cycles, or more than 6X the cost. And of course you’re trashing other data in
      the cache in the process. As you traverse more and more of the list, the numbers snowball,
      and the amortization of locality, or lack thereof, provides a stark contrast.
   </p>
        <p>
      There’s another subtle reason why this is important. Stop and think about what happens
      when a GC occurs.
   </p>
        <p>
      Yep, that’s right. Your data structures need to be traversed during a GC, after all,
      to ascertain the liveness of any pointers held within. That scan looks a whole heck
      of a lot like the same traversal I just described, and enjoys the same locality properties.
      So we can immediately conclude that data structures whose traversal is efficient will
      translate into less time spent in the GC chasing pointers, and better cache efficiency.
   </p>
        <p>
      For programs that are sensitive to long pause times, this is huge. I talk to customers
      all the time whose programs are sensitive to microsecond-long GC delays, and – aside
      from ensuring good GC lifetime practices, like ensuring all objects either die young
      or live long – being conscious about locality can be immensely important. Especially
      for any long-lived, large data structures that will be subject to Gen2 collections
      throughout their lifetime.
   </p>
        <p>
      There is another useful trick to know. If a data structure contains no pointers, the
      GC will not have to trace these pointers. Obvious, right? A linked list inherently
      contains pointers, so this trick really doesn’t apply: the GC will need to traverse
      the whole live portions of the object graph. What’s interesting is that an array,
      on the other hand, may or may not contain elements that contain pointers. For example,
      an array of ints clearly has no pointers, whereas an array of string references clearly
      does. This doesn’t just apply to the primitive types, but also custom structs which
      may or may not contain references. When the GC encounters such an array, its contents
      need not be traversed: instead, the array is alive, and that's that. Yet another opportunity
      to eliminate pointer chasing. Not only does this save the GC from doing some heavy
      lifting, but the pointer-free structs eliminate the need for the GC write barriers
      on array stores too.
   </p>
        <p>
      So think of all this next time you’re confronted with the decision to employ a tree,
      graph, or linked list, and whether a dense, and perhaps pointer-free, representation
      could be beneficial. Even if it means you must replace pointers with index calculations.
      The locality benefits may not matter, but then again, they may. And at least you can
      knowingly make a balanced tradeoff, with these potential advantages in mind.
   </p>
        <img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=558a8e4d-efe4-4f61-99e1-f46f8af76e0e" />
      </body>
      <title>Dense, and pointer-free</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,558a8e4d-efe4-4f61-99e1-f46f8af76e0e.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2010/11/01/DenseAndPointerfree.aspx</link>
      <pubDate>Mon, 01 Nov 2010 00:41:41 GMT</pubDate>
      <description>&lt;p&gt;
   I rambled on and on a few weeks back about how much performance matters. Although
   I got a lot of contrary feedback, most of it had to do with my deliberately controversial
   title than the content of the article. I had intended to post a redux, but something
   more concise is on my mind lately.
&lt;/p&gt;
&lt;p&gt;
   GC-based memory management is a boon to productivity, not to mention program safety.
   Few would argue with this. However, the most effective developers know how their particular
   GC works, and optimize their program’s data structure, allocation, and lifetime behavior
   to suit their particular GC best.
&lt;/p&gt;
&lt;p&gt;
   This is dangerous, but a pragmatic fact of life. It is dangerous because who’s to
   say that the runtime team doesn’t intend to entirely revamp the GC’s collection strategy
   next release, at which point your thoughtfulness may actually harm you? It’s a pragmatic
   fact for a few reasons: it’s probably not likely that the behavior of your favorite
   GC is going to change too fundamentally over time; if it did, you’d need to rethink
   things anyhow; oh, and when was that next release anyway (2 or more years out); and
   finally, what do you care about more, the theoretical loose coupling, or real results
   today?
&lt;/p&gt;
&lt;p&gt;
   One of the worst data structures for traversal is a linked list. That’s because its
   contents are fetched by pointer-chasing, an act that usually destroys locality, unless
   the data associated with each pointer was carefully constructed to live next to its
   previous and next pointer’s data. This seldom happens, because the main point of a
   linked list is to free you from such constraints.
&lt;/p&gt;
&lt;p&gt;
   One of the best data structures for traversal, on the other hand, is an array. Adjacent
   elements are truly adjacent to one another in memory, meaning that as you fetch the
   i’th element from memory, you’re probably pulling in the i’th, i+1’th, …, and so on,
   thanks to spatial locality. Of course, if the elements are just pointers, then you're
   back to the chasing game; as with anything, it depends.
&lt;/p&gt;
&lt;p&gt;
   How many elements you prefetch of course depends on the size of the elements with
   respect to your processor’s cache line. If you’re working with 8-byte elements, and
   128-byte cache lines, then you may pay 100 cycles for the first fetch, and then amortize
   that cost over the subsequent 15 found cheaply in cache for 10 cycles. The result
   is about 250 cycles total; compare this to a linked list, where you’ll probably spend
   1,600 cycles, or more than 6X the cost. And of course you’re trashing other data in
   the cache in the process. As you traverse more and more of the list, the numbers snowball,
   and the amortization of locality, or lack thereof, provides a stark contrast.
&lt;/p&gt;
&lt;p&gt;
   There’s another subtle reason why this is important. Stop and think about what happens
   when a GC occurs.
&lt;/p&gt;
&lt;p&gt;
   Yep, that’s right. Your data structures need to be traversed during a GC, after all,
   to ascertain the liveness of any pointers held within. That scan looks a whole heck
   of a lot like the same traversal I just described, and enjoys the same locality properties.
   So we can immediately conclude that data structures whose traversal is efficient will
   translate into less time spent in the GC chasing pointers, and better cache efficiency.
&lt;/p&gt;
&lt;p&gt;
   For programs that are sensitive to long pause times, this is huge. I talk to customers
   all the time whose programs are sensitive to microsecond-long GC delays, and – aside
   from ensuring good GC lifetime practices, like ensuring all objects either die young
   or live long – being conscious about locality can be immensely important. Especially
   for any long-lived, large data structures that will be subject to Gen2 collections
   throughout their lifetime.
&lt;/p&gt;
&lt;p&gt;
   There is another useful trick to know. If a data structure contains no pointers, the
   GC will not have to trace these pointers. Obvious, right? A linked list inherently
   contains pointers, so this trick really doesn’t apply: the GC will need to traverse
   the whole live portions of the object graph. What’s interesting is that an array,
   on the other hand, may or may not contain elements that contain pointers. For example,
   an array of ints clearly has no pointers, whereas an array of string references clearly
   does. This doesn’t just apply to the primitive types, but also custom structs which
   may or may not contain references. When the GC encounters such an array, its contents
   need not be traversed: instead, the array is alive, and that's that. Yet another opportunity
   to eliminate pointer chasing. Not only does this save the GC from doing some heavy
   lifting, but the pointer-free structs eliminate the need for the GC write barriers
   on array stores too.
&lt;/p&gt;
&lt;p&gt;
   So think of all this next time you’re confronted with the decision to employ a tree,
   graph, or linked list, and whether a dense, and perhaps pointer-free, representation
   could be beneficial. Even if it means you must replace pointers with index calculations.
   The locality benefits may not matter, but then again, they may. And at least you can
   knowingly make a balanced tradeoff, with these potential advantages in mind.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=558a8e4d-efe4-4f61-99e1-f46f8af76e0e" /&gt;</description>
      <category>Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=d3af3274-061f-4cb4-889f-fa92bf6792dc</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,d3af3274-061f-4cb4-889f-fa92bf6792dc.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
      I have several positions open on my team here at Microsoft.
   </p>
        <p>
      My team's responsibility spans multiple aspects of a research operating system’s programming
      model. The three main areas are concurrency, languages, and frameworks. When I say
      concurrency, I mean things like asynchrony and message passing, data and task parallelism,
      distributed parallelism, runtime scheduling and resource management, and heterogeneity
      and GPGPU. When I say languages, I mean type systems, mostly-functional programming,
      verified safe concurrency, and both front- and back-end compilation. And when I say
      frameworks, I mean virtually anything you could imagine wanting out of a platform
      framework: all things XML, data interoperability (database, web services, etc.), collections,
      transactions, multimaster synchronization, and even low level things, like regex,
      numerics, and globalization.
   </p>
        <p>
      Our team is 100% developers, and we have an “everybody codes, everybody loves to code”
      culture. Even managers are expected to spend a significant amount of time prolifically
      writing code.
   </p>
        <p>
      All of these components are new and built from the ground up. So self-drive and an
      ability to have a vision and make it happen are incredibly important.
   </p>
        <p>
      We are always happy to hire great, hard-working people, regardless of years of experience.
      If you’re extremely strong in one or more of the abovementioned areas, more of a generalist,
      are an amazing coder, or all of the above, you’d fit in perfectly. This is the most
      amazing team of people I’ve ever worked with. If you are interested, please email
      your resume to me at joedu AT microsoft DOT com.
   </p>
        <img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=d3af3274-061f-4cb4-889f-fa92bf6792dc" />
      </body>
      <title>We are hiring</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,d3af3274-061f-4cb4-889f-fa92bf6792dc.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2010/09/18/WeAreHiring.aspx</link>
      <pubDate>Sat, 18 Sep 2010 18:42:27 GMT</pubDate>
      <description>&lt;p&gt;
   I have several positions open on my team here at Microsoft.
&lt;/p&gt;
&lt;p&gt;
   My team's responsibility spans multiple aspects of a research operating system’s programming
   model. The three main areas are concurrency, languages, and frameworks. When I say
   concurrency, I mean things like asynchrony and message passing, data and task parallelism,
   distributed parallelism, runtime scheduling and resource management, and heterogeneity
   and GPGPU. When I say languages, I mean type systems, mostly-functional programming,
   verified safe concurrency, and both front- and back-end compilation. And when I say
   frameworks, I mean virtually anything you could imagine wanting out of a platform
   framework: all things XML, data interoperability (database, web services, etc.), collections,
   transactions, multimaster synchronization, and even low level things, like regex,
   numerics, and globalization.
&lt;/p&gt;
&lt;p&gt;
   Our team is 100% developers, and we have an “everybody codes, everybody loves to code”
   culture. Even managers are expected to spend a significant amount of time prolifically
   writing code.
&lt;/p&gt;
&lt;p&gt;
   All of these components are new and built from the ground up. So self-drive and an
   ability to have a vision and make it happen are incredibly important.
&lt;/p&gt;
&lt;p&gt;
   We are always happy to hire great, hard-working people, regardless of years of experience.
   If you’re extremely strong in one or more of the abovementioned areas, more of a generalist,
   are an amazing coder, or all of the above, you’d fit in perfectly. This is the most
   amazing team of people I’ve ever worked with. If you are interested, please email
   your resume to me at joedu AT microsoft DOT com.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=d3af3274-061f-4cb4-889f-fa92bf6792dc" /&gt;</description>
      <category>Miscellaneous;Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=4db70333-295b-441f-80f9-21b90bd44287</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,4db70333-295b-441f-80f9-21b90bd44287.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <title>The 'premature optimization is evil' myth</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,4db70333-295b-441f-80f9-21b90bd44287.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2010/09/06/ThePrematureOptimizationIsEvilMyth.aspx</link>
      <pubDate>Mon, 06 Sep 2010 18:38:49 GMT</pubDate>
      <description>&lt;p&gt;
   I can’t tell you how many times I’ve heard the age-old adage echoed inappropriately
   and out of context:
&lt;/p&gt;
&lt;p&gt;
   &lt;i&gt;&lt;blockquote&gt;"We should forget about small efficiencies, say about 97% of the time;
   premature optimization is the root of all evil"&lt;br&gt;
   -- Donald E. Knuth, &lt;a href="http://portal.acm.org/citation.cfm?id=356640"&gt;Structured
   Programming with go to Statements&lt;/a&gt;&lt;/blockquote&gt;&lt;/i&gt;
&lt;/p&gt;
&lt;p&gt;
   I have heard the "premature optimization is the root of all evil" statement used by
   programmers of varying experience at every stage of the software lifecycle, to defend
   all sorts of choices, ranging from poor architectures, to gratuitous memory allocations,
   to inappropriate choices of data structures and algorithms, to complete disregard
   for variable latency in latency-sensitive situations, among others.
&lt;/p&gt;
&lt;p&gt;
   Mostly this quip is used defend sloppy decision-making, or to justify the indefinite
   deferral of decision-making. In other words, laziness. It is safe to say that the
   very mention of this oft-misquoted phrase causes an immediate visceral reaction to
   commence within me... and it’s not a pleasant one.
&lt;/p&gt;
&lt;p&gt;
   In this short article, we’ll look at some important principles that are counter to
   what many people erroneously believe this statement to be saying. To save you time
   and suspense, I will summarize the main conclusions: I do not advocate contorting
   oneself in order to achieve a perceived minor performance gain. Even the best performance
   architects, when following their intuition, are wrong 9 times out of 10 about what
   matters. (Or maybe 97 times out of 100, based on Knuth’s numbers.) What I do advocate
   is thoughtful and intentional performance tradeoffs being made as every line of code
   is written. Always understand the order of magnitude that matters, why it matters,
   and where it matters. And measure regularly! I am a big believer in statistics, so
   if a programmer sitting in his or her office writing code thinks just a little bit
   more about the performance implications of every line of code that is written, he
   or she will save an entire team that time and then some down the road. Given the choice
   between two ways of writing a line of code, both with similar readability, writability,
   and maintainability properties, and yet interestingly different performance profiles,
   don’t be a bozo: choose the performant approach. Eschew redundant work, and poorly
   written code. And lastly, avoid gratuitously abstract, generalized, and allocation-heavy
   code, when slimmer, more precise code will do the trick.
&lt;/p&gt;
&lt;p&gt;
   Follow these suggestions and your code will just about always win in both maintainability
   and performance.
&lt;/p&gt;
&lt;h1&gt;Understand the order of magnitude that matters
&lt;/h1&gt;
&lt;p&gt;
   First and foremost, you really ought to understand what order of magnitude matters
   for each line of code you write.
&lt;/p&gt;
&lt;p&gt;
   In other words, you need to have a budget; what can you afford, and where can you
   afford it? The answer here changes dramatically depending on whether you’re writing
   a device driver, reusable framework library, UI control, highly-connected network
   application, installation script, etc. No single answer fits all.
&lt;/p&gt;
&lt;p&gt;
   I am personally used to writing code where 100 CPU cycles matters. So invoking a function
   that acquires a lock by way of a shared-memory interlocked instruction that may take
   100 cycles is something I am apt to think hard about; even more worrisome is if that
   acquisition could block waiting for 100,000 cycles. Indeed this situation could become
   disastrous under load. As you can tell, I write a lot of systems code. If you’re working
   on a network-intensive application, on the other hand, most of the code you write
   is going to be impervious to 100 cycle blips, and more sensitive to efficient network
   utilization, scalability, and end-to-end performance. And if you’re writing a little
   one-time script, or some testing or debugging program, you may get away with ignoring
   performance altogether, even multi-million cycle network round-trips.
&lt;/p&gt;
&lt;p&gt;
   To be successful at this, you’ll need to know what things cost. If you don’t know
   what things cost, you’re just flailing in the dark, hoping to get lucky. This includes
   rule of thumb order of magnitudes for primitive operations – e.g. reading / writing
   a register (nanoseconds, single-digit cycles), a cache hit (nanoseconds, tens of cycles),
   a cache miss to main memory (nanoseconds, hundreds of cycles), a disk access including
   page faults (micro- or milliseconds, millions of cycles), and a network roundtrip
   (milliseconds or seconds, many millions of cycles) – in addition to peering beneath
   opaque abstractions provided by other programmers, to understand their best, average,
   and worst case performance.
&lt;/p&gt;
&lt;p&gt;
   Clearly the concerns and situations you must work to avoid change quite substantially
   depending on the class of code you are writing, and whether the main function of your
   program is delivering a user experience (where usability reigns supreme), delivering
   server-side throughput, etc. Thinking this through is crucial, because it helps avoid
   true "premature optimization" traps where a programmer ends up writing complicated
   and convoluted code to save 10 cycles, when he or she really needs to be thinking
   about architecting the interaction with the network more thoughtfully to asynchronously
   overlap round-trips. Understanding how performance impacts the main function of your
   program drives all else.
&lt;/p&gt;
&lt;p&gt;
   Pay attention to interoperability between layers of separately authored software that
   is composed together. The most common cause of hangs is that an API didn’t specify
   the expected performance, and so a UI programmer ended up using it in an innocuous
   but inappropriate way, because they couldn’t afford the range of order of magnitude
   cost that the API’s performance was expected to fall within. Hangs aren’t the only
   manifestation; O(N^2), or worse, performance can also result, if, for example, a caller
   didn’t realize the function called was going to enumerate a list in order to generate
   its results.
&lt;/p&gt;
&lt;p&gt;
   It is also important to think about worst case situations. What happens if that lock
   is held for longer than expected, because the system is under load and the scheduler
   is overloaded? And what if the owning thread was preempted while holding the lock,
   and now will not get to run again for quite some time? What happens if the network
   is saturated because a big news event is underway, or worse, the phone network is
   intermittently cutting out, the network cable has been unplugged, etc.? What about
   the case where, because a user has launched far too many applications at once, your
   memory-intensive operation that usually enjoys nice warmth and locality suddenly begins
   waiting for the disk on the majority of its memory accesses, due to demand paging?
   These things happen all the time.
&lt;/p&gt;
&lt;p&gt;
   In each of these situations, you can end up paying many more orders of magnitude in
   cost than you expected under ordinary circumstances. The lock acquisition that usually
   took 100 CPU cycles now takes several million cycles (as long as a network roundtrip),
   and the network operation that is usually measured in milliseconds is now measured
   in tens of seconds, as the software painfully waits for the operation to time out.
   And your "non-blocking" memory-intensive algorithm on the UI thread just caused a
   hang, because it’s paging like crazy.
&lt;/p&gt;
&lt;p&gt;
   You’ve experienced these problems as a user of modern software, I am sure, and it
   isn’t fun. An hourglass, spinning donut, unresponsive button clicks, "(Not Responsive)"
   title bars, and bleachy white screens. An important measurement of a programmer’s
   worth is how good the code they write operates under the extreme and rare circumstances.
   Because, truth be told, when you have a large user-base, these circumstances aren’t
   that rare after all. This is more of a "performance in the large" thing, but it turns
   out that the end result is delivered as a result of many "performance in the small"
   decisions adding up. A developer wrote code meant to be used in a particular way,
   but decided what order of magnitude was reasonable based on best case, … and gave
   no thought to the worst case.
&lt;/p&gt;
&lt;h1&gt;Using the right data structure for the job
&lt;/h1&gt;
&lt;p&gt;
   This is motherhood and apple pie, Computer Science 101, … bad clichés abound. And
   yet so many programmers get this wrong, because they simply don’t give it enough thought.
&lt;/p&gt;
&lt;p&gt;
   One of my favorite books on the topic ("Programming Pearls") has this to say about
   them:
&lt;/p&gt;
&lt;p&gt;
   &lt;i&gt;&lt;blockquote&gt;"Most programmers have seen them, and most good programmers realize
   they’ve written at least one.&lt;br&gt;
   They are huge, messy, ugly programs that should have been short, clean, beautiful
   programs."&lt;/blockquote&gt;&lt;/i&gt;
&lt;/p&gt;
&lt;p&gt;
   I’ll add one adjective to the "short, clean, beautiful" list: fast.
&lt;/p&gt;
&lt;p&gt;
   Data structures drive storage and access behavior, both strongly affecting the size
   and speed of algorithms and components that make use of them. Worst case really does
   matter. This too is a case where the right choice will boost not only performance
   but also the cleanliness of the program.
&lt;/p&gt;
&lt;p&gt;
   I’m actually not going to spend too much time on this; when I said this is CS101,
   I meant it. However, it is crucial to be intentional and smart in this choice. Validate
   assumptions, and measure.
&lt;/p&gt;
&lt;p&gt;
   Ironically, in my experience, many senior programmers can make frighteningly bad data
   structure choices, often because they are more apt to choose a sophisticated and yet
   woefully inappropriate one. They may choose a linked list, for example, because they
   want zero-allocation element linking via an embedded next pointer. And yet they then
   end up with many lists traversals throughout the program, where a dense array representation
   would have been well worth the extra allocation. The naïve programmer would have happily
   new’d up a List&amp;lt;T&amp;gt;, and avoided some common pitfalls; yet, here the senior guy
   is working as hard as humanly possible to avoid a single extra allocation. They over-optimized
   in one dimension, and ignored many others that mattered more.
&lt;/p&gt;
&lt;p&gt;
   This same class of programmer may choose a very complicated lock-free data structure
   for sharing elements between threads, incurring many more object allocations (and
   thus increased GC pressure), and a large number of expensive interlocked operations
   scattered throughout the code. The sexy lure of lock-freedom tricked them into making
   a bad choice. Perhaps they didn’t quite understand that locks and lock-free data structures
   share many costs in common. Or perhaps they just hoped to get lucky and squeeze out
   out-of-this-world scalability thanks to lock-freedom, without actually considering
   the access patterns necessary to lead to such scalability and whether their program
   employed them.
&lt;/p&gt;
&lt;p&gt;
   These are often held up as examples of "premature optimization", but I hold them up
   as examples of "careless optimization". The double kicker here is that the time spent
   building the more complicated solution would have been better spent carefully thinking
   and measuring, and ultimately deciding not to be overly clever in the first place.
   This most often plagues the mid-range programmer, who is just smart enough to know
   about a vast array of techniques, but not yet mature enough to know when not to employ
   them.
&lt;/p&gt;
&lt;h1&gt;A different, better-performing approach
&lt;/h1&gt;
&lt;p&gt;
   It’s an all-too-common occurrence. I’ll give code review feedback, asking "Why didn’t
   you take approach B? It seems to be just as clear, and yet obviously has superior
   performance." Again, this is in a circumstance where I believe the difference matters,
   given the order of magnitude that matters for the code in question. And I’ll get a
   response, "Premature optimization is the root of all evil." At which point I want
   to smack this certain someone upside the head, because it’s such a lame answer.
&lt;/p&gt;
&lt;p&gt;
   The real answer is that the programmer didn’t stop to carefully consider alternatives
   before coding up solution A. (To be fair, sometimes good solutions evade the best
   of us.) The reality is that the alternative approach should have been taken; it may
   be true that it’s "too late" because the implications of the original decision were
   nontrivial and perhaps far-reaching, but that is too often an unfortunate consequence
   of not taking the care and thought to do it right in the first place.
&lt;/p&gt;
&lt;p&gt;
   These kinds of "peanut butter" problems add up in a hard to identify way. Your performance
   profiler may not obviously point out the effect of such a bad choice so that it’s
   staring you in your face. Rather than making one routine 1000% slower, you may have
   made your entire program 3% slower. Make enough of these sorts of decisions, and you
   will have dug yourself a hole deep enough to take a considerable percentage of the
   original development time just digging out. I don’t know about you, but I prefer to
   clean my house incrementally and regularly rather than letting garbage pile up to
   the ceilings, with flies buzzing around, before taking out the trash. Simply put,
   all great software programmers I know are proactive in writing clear, clean, and smart
   code. They pour love into the code they write.
&lt;/p&gt;
&lt;p&gt;
   In this day and age, where mobility and therefore power is king, instructions matter.
   My boss is fond of saying "the most performant instruction is the one you didn’t have
   to execute." And it’s true. The best way to save battery power on mobile phones is
   to execute less code to get the same job done.
&lt;/p&gt;
&lt;p&gt;
   To take an example of a technology that I am quite supportive of, but that makes writing
   inefficient code very easy, let’s look at LINQ-to-Objects. Quick, how many inefficiencies
   are introduced by this code?
&lt;/p&gt;
&lt;p&gt;
   &lt;code&gt;&lt;blockquote&gt;&lt;pre&gt;&lt;= hi)
                  select (x * c);
    return results.ToArray();
}
&lt;/pre&gt;
int[] Scale(int[] inputs, int lo, int hi, int c) {
    var results = from x in inputs
                  where (x &gt;= lo) &amp;&amp; (x 
   &lt;/blockquote&gt;&lt;/code&gt;
&lt;/p&gt;
&lt;p&gt;
   It’s hard to account for them all.
&lt;/p&gt;
&lt;p&gt;
   There are two delegate object allocations, one for the call to Enumerable.Where and
   the other for the call to Enumerable.Select. These delegates point to potentially
   two distinct closure objects, each of which has captured enclosing variables. These
   closure objects are instances of new classes, which occupy nontrivial space in both
   the binary and at runtime. (And of course, the arguments are now stored in two places,
   must be copied to the closure objects, and then we must incur extra indirections each
   time we access them.) In all likelihood, the Where and Select operators are going
   to allocate new IEnumerable&lt;T&gt;
      and new IEnumerator&lt;T&gt;
         objects. For each element in the input, the Where operator will make two interface
         method calls, one to IEnumerator&lt;T&gt;
            .MoveNext and the other to IEnumerator&lt;T&gt;
               .get_Current. It will then make a delegate call, which is slightly more expensive
               than a virtual method call on the CLR. For each element for which the Where delegate
               returns ‘true’, the Select operator will have likewise made two interface method calls,
               in addition to another delegate invocation. Oh, and the implementations of these likely
               use C# iterators, which produce relatively fat code, and are implemented as state
               machines which will incur more overhead (switch statements, state variables, etc.)
               than a hand-written implementation.
&lt;/p&gt;
&lt;p&gt;
   Wow. And we aren’t even done yet. The ToArray method doesn’t know the size of the
   output, so it must make many allocations. It’ll start with 4 elements, and then keep
   doubling and copying elements as necessary. And we may end up with excess storage.
   If we end up with 33,000 elements, for example, we will waste about 128KB of dynamic
   storage (32,000 X 4-byte ints).
&lt;/p&gt;
&lt;p&gt;
   A programmer may have written this routine this way because he or she has recently
   discovered LINQ, or has heard of the benefits of writing code declaratively. And/or
   he or she may have decided to introduce a more general purpose implementation of a
   Scale API versus doing something specific to the use in the particular program that
   Scale will be immediately used in. This is a great example of why premature generalization
   is often at odds with writing efficient code.
&lt;/p&gt;
&lt;p&gt;
   Imagine an alternative universe, on the other hand, where Scale will only get used
   once and therefore we can take advantage of certain properties of its usage. Namely,
   perhaps the input array need not be preserved, and instead we can update the elements
   matching the criteria in place:
&lt;/p&gt;
&lt;p&gt;
   &lt;code&gt;&lt;blockquote&gt;&lt;pre&gt;&lt; inputs.Length; i++) {
        if ((inputs[i] &gt;&lt;= hi)) {
            inputs[i] *= c;
        }
    }
}
&lt;/pre&gt;
void ScaleInPlace(int[] inputs, int lo, int hi, int c) {
    for (int i = 0; i = lo) &amp; (inputs[i] 
   &lt;/blockquote&gt;&lt;/code&gt;
&lt;/p&gt;
&lt;p&gt;
   A quick-and-dirty benchmark shows this to be an order of magnitude faster. Again,
   is it an order of magnitude that you care about? Perhaps not. See my earlier thoughts
   on that particular topic. But if you care about costs in the 100s or 1000s of cycles
   range, you probably want to pay heed.
&lt;/p&gt;
&lt;p&gt;
   Now, I’m not trying to take potshots at LINQ. It was just an example. In fact, I spent
   3 years running a team that delivered PLINQ, a parallel execution engine for LINQ-to-Objects.
   LINQ is great where you can afford it, and/or where the alternatives do not offer
   ridiculously better performance. For example, if you can’t do in-place updates, functionally
   producing new data is going to require allocations no matter which way you slice it.
   But having watched people using PLINQ, I have witnessed numerous occasions where an
   inordinately expensive query was made 8-times faster by parallelizing… where the trivial
   refactoring into a slimmed down algorithm with proper data structures would have speed
   the code up by 100-fold. Parallelizing a piggy piece of code to make it faster merely
   uses more of the machine to get the same job done, and will negatively impact power,
   resource management, and utilization.
&lt;/p&gt;
&lt;p&gt;
   Another view is that writing code in this declarative manner is better, because it’ll
   just get faster as the compiler and runtimes enjoy new optimizations. This sounds
   nice, and seems like taking a high road of some sort. But what usually matters is
   today: how does the code perform using the latest and greatest technology available
   today. And if you scratch underneath the surface, it turns out that most of these
   optimizations are what I call "science fiction" and unlikely to happen. If you write
   redundant asserts at twenty adjacent layers of your program, well, you’re probably
   going to pay for them. If you allocate objects like they are cheap apples growing
   on trees, you’re going to pay for them. True, optimizations might make things faster
   over time, but usually not in the way you expect and usually not by orders of magnitude
   unless you are lucky.
&lt;/p&gt;
&lt;p&gt;
   A colleague of mine used to call C a WYWIWYG language—"what you write is what you
   get"—wherein each line of code roughly mapped one-to-one, in a self-evident way, with
   a corresponding handful of assembly instructions. This is a stark contrast to C#,
   wherein a single line of code can allocate many objects and have an impact to the
   surrounding code by introducing numerous silent indirections. For this reason alone,
   understanding what things cost and paying attention to them is admittedly more difficult
   – and arguably more important – in C# than it was back in the good ole’ C days. ILDASM
   is your friend … as is the disassembler. Yes, good systems programmers regularly look
   at the assembly code generated by the .NET JIT. Don’t assume it generates the code
   you think it does.
&lt;/p&gt;
&lt;h1&gt;Gratuitous memory allocations
&lt;/h1&gt;
&lt;p&gt;
   I love C#. I really do. I was reading my "Secure Coding in C and C++" book for fun
   this weekend, and it reminded me how many of those security vulnerabilities are eliminated
   by construction thanks to type- and memory-safety.
&lt;/p&gt;
&lt;p&gt;
   But the one thing I don’t love is how easy and invisible it makes heap allocations.
&lt;/p&gt;
&lt;p&gt;
   The very fact that C++ memory management is painful means most C++ programmers are
   overly-conscious about allocation and footprint. Having to opt-in to using pointers
   means developers are conscious about indirections, rather than having them everywhere
   by default. These are qualitative, hard-to-backup statements, but in my experience
   they are quite true. It’s also cultural.
&lt;/p&gt;
&lt;p&gt;
   Because they are so easy, it’s doubly important to be on the lookout for allocations
   in C#. Each one adds to a hard-to-quantify debt that must be repaid later on when
   a GC subsequently scans the heap looking for garbage. An API may appear cheap to invoke,
   but it may have allocated an inordinate amount of memory whose cost is only paid for
   down the line. This is certainly not "paying it forward."
&lt;/p&gt;
&lt;p&gt;
   It’s never been so easy to read a GB worth of data into memory and then haphazardly
   leave it hanging around for the GC to take care of at some indeterminate point in
   the future as it is in .NET. Or an entire list of said data. Too many times a .NET
   program holds onto GBs of working set, when a more memory conscientious approach would
   have been to devise an incremental loading strategy, employ a denser representation
   of this data, or some combination of the two. But, hey, memory is plentiful and cheap!
   And in the worst case, paging to disk is free! Right? Wrong. Think about the worst
   case.
&lt;/p&gt;
&lt;p&gt;
   Depending on the size of the objects allocated, how long they remain live, and how
   many processors are being used, the subsequent GCs necessary to clean up incessant
   allocations may impact a program in a difficult to predict way. Allocating a bunch
   of very large objects that live for long enough to make it out of the nursery, but
   not forever, for instance, is one of the worst evils you can do. This is known as
   mid-life crisis. You either want really short-lived objects or really long-lived ones.
   But in any case, it really matters: the LINQ example earlier shows how easy it is
   to allocate crap without seeing it in the code.
&lt;/p&gt;
&lt;p&gt;
   If I could do it all over again, I would make some changes to C#. I would try to keep
   pointers, and merely not allow you to free them. Indirections would be explicit. The
   reference type versus value type distinction would not exist; whether something was
   a reference would work like C++, i.e. you get to decide. Things get tricky when you
   start allocating things on the stack, because of the lifetime issues, so we’d probably
   only support stack allocation for a subset of primitive types. (Or we’d employ conservative
   escape analysis.) Anyway, the point here is to illustrate that in such a world, you’d
   be more conscious about how data is laid out in memory, encouraging dense representations
   over sparse and pointer rich data structures silently spreading all over the place.
   We don't live in this world, so pretend as though we do; each time you see a reference
   to an object, think "indirection!" to yourself and react as you would in C++ when
   you see a pointer dereference.
&lt;/p&gt;
&lt;p&gt;
   Allocations are not always bad, of course. It’s easy to become paranoid here. You
   need to understand your memory manager to know for sure. Most GC-based systems, for
   example, are heavily tuned to make successive small object allocations very fast.
   So if you’re programming in C#, you’ll get less "bang for your buck" by fusing a large
   number of contiguous object allocations into a small number of large object allocations
   that ultimately occupy an equivalent amount of space, particularly if those objects
   are short-lived. Lots of little garbage tends to be pretty painless, at least relatively.
&lt;/p&gt;
&lt;h1&gt;Variable latency and asynchrony
&lt;/h1&gt;
&lt;p&gt;
   There aren’t many ways to introduce a multisecond delay into your program at a moment’s
   notice. But I/O can do just that.
&lt;/p&gt;
&lt;p&gt;
   Code with highly variable latency is dangerous, because it can have dramatically different
   performance characteristics depending on numerous variables, many of which are driven
   by environmental conditions outside of your program’s control. As such it is immensely
   important to document where such variable latency can occur, and to program defensively
   against it happening.
&lt;/p&gt;
&lt;p&gt;
   For example, imagine a team of twenty programmers building some desktop application.
   The team is just large enough that no one person can understand the full details of
   how the entire system works. So you’ve got to compose many pieces together. (As I
   mentioned earlier, composition can lead to hard-to-predict performance characteristics.)
   Programmer Alice is responsible for serving up a list of fonts, and Programmer Bob
   is consuming that list to paint it on the UI. Does Bob know what it takes to fetch
   the list of fonts? Probably not. Does Alice know the full set of concerns that Bob
   must think about to deliver a responsive UI, like incremental repaints, progress reporting,
   and cancellation? Probably not. So Alice does the best she knows how to do: she hits
   the cache, when the font cache is fully populated, and falls back to fetching fonts
   from the printer otherwise. She returns a List&amp;lt;Font&amp;gt; object from her API. Now
   Bob just calls the API and paints the results on the UI; the call appears to be quite
   snappy in his testing. Unfortunately, when the cache is not populated, the "quick"
   cache hit turns into a series of network roundtrips with the printer; that hangs the
   UI, but only for a few milliseconds. Even worse, when the printer is accidentally
   offline, perhaps due to a power outage, the UI freezes for twenty seconds, because
   that’s the hard-coded timeout. Ouch!
&lt;/p&gt;
&lt;p&gt;
   This situation happens all the time. It’s one of the most common causes of UI hangs.
&lt;/p&gt;
&lt;p&gt;
   If you’re programming in an environment where asynchrony is first class, Alice could
   have advertised the fact that, under worst-case circumstances, fetching the font list
   would take some time. If she were programming in .NET, for example, she’d return a
   Task&amp;lt;List&amp;lt;Font&amp;gt;&amp;gt; rather than a List&amp;lt;Font&amp;gt;. The API would then be
   self-documenting, and Bob would know that waiting for the task’s result is dangerous
   business. He knows, of course, that blocking the UI thread often leads to responsiveness
   problems. So he would instead use the ContinueWith API to rendezvous with the results
   once they become available. And Bob may now know he needs to go back and work more
   closely with Alice on this interface: to ensure cancellation is wired up correctly,
   and to design a richer interface that facilitates incremental painting and progress
   reporting.
&lt;/p&gt;
&lt;p&gt;
   Variable latency is not just problematic for responsiveness reasons. If I/O is expressed
   synchronously, a program cannot efficiently overlap many of them. Imagine we must
   make three network calls as part of completing an operation, and that each one will
   take 25 milliseconds to complete. If we do synchronous I/O, the whole operation will
   take at least 75 milliseconds. If we launch do asynchronous I/O, on the other hand,
   the operation may take as few as 25 milliseconds to finish. That’s huge.
&lt;/p&gt;
&lt;p&gt;
   If I had my druthers, all I/O would be asynchronous. But that’s not where we are today.
&lt;/p&gt;
&lt;p&gt;
   The concern is not limited to just I/O, of course. Compute- and memory-bound work
   can quickly turn into variable latency work, particularly under stressful situations
   like when an application is paging. So truthfully any abstraction doing "heavy lifting"
   should offer an asynchronous alternative.
&lt;/p&gt;
&lt;h1&gt;Examples of "bad" optimizations
&lt;/h1&gt;
&lt;p&gt;
   It is easy to take it too far. Even if you’re shaving off cycles where each-and-every
   cycle matters, you may be doing the wrong thing. If anything, I hope this article
   has convinced you to be thoughtful, and to strive to strike a healthy balance between
   beautiful code and performance.
&lt;/p&gt;
&lt;p&gt;
   Anytime the optimization sacrifices maintainability, it is highly suspect. Indeed,
   many such optimizations are superficial and may not actually improve the resulting
   code’s performance.
&lt;/p&gt;
&lt;p&gt;
   The worst category of optimization is one that can lead to brittle and insecure code.
&lt;/p&gt;
&lt;p&gt;
   One example of this is heavy reliance on stack allocation. In C and C++, doing stack
   allocation of buffers often leads to difficult choices, like fixing the buffer size
   and writing to it in place. There is perhaps no single technique that, over the years,
   has led to the most buffer overrun-based exploits. Not only that, but stack overflow
   in Windows programs is quite catastrophic, and increases in likelihood the more stack
   allocation that a program does. So doing _alloca in C++ or stackalloc in C# is really
   just playing with fire, particularly for dynamically sized and potentially big allocations.
&lt;/p&gt;
&lt;p&gt;
   Another example is using unsafe code in C#. I can’t tell you how many times I’ve seen
   programmers employ unsafe pointer arithmetic to avoid the automatic bounds checking
   generated by the CLR JIT compiler. It is true that in some circumstances this can
   be a win. But it is also true that most programmers who do this never bothered to
   crack open the resulting assembly to see that the JIT compiler does a fairly decent
   job at automatic bounds check hoisting. This is an example where the cost of the optimization
   outweighs the benefits in most circumstances. The cost to pin memory, the risk of
   heap corruption due to a failure to properly pin memory or an offset error, and the
   complication in the code, are all just not worth it. Unless you really have actually
   measured and found the routine to be a problem.
&lt;/p&gt;
&lt;p&gt;
   If it smells way too complicated, it probably is.
&lt;/p&gt;
&lt;h1&gt;In conclusion
&lt;/h1&gt;
&lt;p&gt;
   I’m not saying Knuth didn’t have a good point. He did. But the "premature optimization
   is the root of all evil" pop-culture and witty statement is not a license to ignore
   performance altogether. It’s not justification to be sloppy about writing code. Proactive
   attention to performance is priceless, especially for certain kinds of product- and
   systems-level software.
&lt;/p&gt;
&lt;p&gt;
   My hope is that this article has helped to instill a better sense, or reinforce an
   existing good sense, for what matters, where it matters, and why it matters when it
   comes to performance. Before you write a line of code, you really need to have a budget
   for what you can afford; and, as you write your code, you must know what that code
   costs, and keep a constant mental tally of how much of that budget has been spent.
   Don’t exceed the budget, and most certainly do not ignore it and just hope that wishful
   thinking will save your behind. Building up this debt will cost you down the road,
   I promise. And ultimately, test-driven development works for performance too; you
   will at least know right away once you have exceeded your budget.
&lt;/p&gt;
&lt;p&gt;
   Think about worst case performance. It’s not the only thing that matters, but particularly
   when the best and worst case differ by an order of magnitude, you will probably need
   to think more holistically about the composition of caller and callee while building
   a larger program out of constituent parts.
&lt;/p&gt;
&lt;p&gt;
   And lastly, the productivity and safety gains of managed code, thanks to nice abstractions,
   type- and memory-safety, and automatic memory management, do not have to come at the
   expense of performance. Indeed this is a stereotype that performance conscious programmers
   are in a position to break down. All you need to do is slow down and be thoughtful
   about each and every line of code you write. Remember, programming is as much engineering
   as it is an art. Measure, measure, measure; and, of course, be thoughtful, intentional,
   and responsible in crafting beautiful and performant code.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=4db70333-295b-441f-80f9-21b90bd44287" /&gt;</description>
      <category>Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=7a57f623-d65d-4212-973d-29bdcf61dd3a</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,7a57f623-d65d-4212-973d-29bdcf61dd3a.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
      That immutability facilitates increased degrees of concurrency is an oft-cited dictum.
      But is it true? And either way, why?
   </p>
        <p>
      My view on this matter may be a controversial one. Immutability is an important foundational
      tool in the toolkit for building concurrent – in addition to reliable and predictable
      – software. But it is not the only one that matters. Making all your data immutable
      isn’t going to instantly lead to a massively scalable program. Natural isolation is
      also critically important, perhaps more so. And, as it turns out, sometimes mutability
      is just what the doctor ordered, as with large-scale data parallelism.
   </p>
        <p>
        </p>
        <h1>Isolation first; immutability second; synchronization last
   </h1>
        <p>
      Stepping back for a moment, the recipe for concurrency is rather simple. Say you’ve
      got multiple concurrent pieces of work running simultaneously (or have a goal of getting
      there); for discussion’s sake, call them <em>tasks</em>. Take two tasks. The first
      critical decision has two cases: either these tasks concurrently access overlapping
      data in shared-memory, or they do not. If they do not, they are <em>isolated</em>,
      and no precautions associated with racing memory updates are needed. If they do share
      data, on the other hand, then something else must give. If all concurrently accessed
      data is <em>immutable</em>, or all functions used to interact with data are <em>pure</em>,
      then dangerous concurrency hazards are avoided. All is well. If some data is mutable,
      however, then this is where things get tricky, and higher-level <em>synchronization</em> is
      needed to make accesses safe. This decision tree is straightforward and clear.
   </p>
        <p>
      I have listed those four attributes – isolated, immutable and pure, and synchronization
      – in a very intentional order. Thankfully, this order mirrors the natural top-down
      hierarchical architecture of most modern object- and component-based programs: we
      have large containers that communicate through well-defined interfaces, each comprised
      of layers of such containers, and somewhere towards the leaves, a fair amount of intimate
      commingling of knowledge regarding data and invariants.
   </p>
        <p>
      This order also reflects the order of complexity and execution-time costs, from least
      to most. Isolation is simple, because components depend on each other in loosely-coupled
      ways, and in fact scales superiorly in a concurrent program because no synchronization
      is necessary, the “right” data structure may be chosen for the job – immutable or
      not – and locality is part-and-parcel to the architecture. Immutability at least avoids
      the morass of synchronization, which can affect programs immensely in complexity,
      runtime overheads, and write-contention for shared data. It is clear that synchronization
      is something to avoid at all costs, particularly anything done in an ad-hoc manner
      like locks.
   </p>
        <p>
        </p>
        <h1>Making the concurrency
   </h1>
        <p>
      But where did all this concurrency come from, anyway?
   </p>
        <p>
      It came from two things:
   </p>
        <ol>
          <li>
         The coarse-grained breakdown of a program into isolated pieces. 
      </li>
          <li>
         The fine-grained data parallelism. 
      </li>
        </ol>
        <p>
      On #1: Program fragments that are isolated are already half-way down the road to running
      concurrently as tasks. The second half of this journey, of course, is teaching them
      to interact with one another asynchronously, most frequently through message-passing
      or by sticking them into a pipeline. The details of course depend on what programming
      language you are using. It may be through agents, actors, active objects, COM objects,
      EJBs, CCR receivers, web-services, something ad-hoc built with .NET tasks, or some
      other reification. Nevertheless the isolation is common to all these.
   </p>
        <p>
      On #2: Data parallelism, it turns out, often works best with mutable data structures.
      These structures must be partitionable, of course, so that tasks comprising the data
      parallel operations may operate with logically isolated chunks of this data safely,
      even if they are parts of the same physical data structure. So chunks of them are
      isolated even though they don’t appear to be. This is trivially achievable with many
      important parallel-friendly data structures like arrays, vectors, and matrices. Capturing
      this isolation in the type system is of course no small task, though region typing
      gets close (see UIUC’s Data Parallel Java).
   </p>
        <p>
      But you usually don’t want these structures to be immutable, because they can be modified
      in constant-time and space if they are their classic simple mutable forms. Programmers
      doing HPC-style data-parallelism a la FORTRAN, vectorization, and GPGPU know this
      quite well. Compare this a world where we are doing data-parallelism over immutable
      data structures, where modifications often necessitate allocations or more complicated
      big-oh times due to clever techniques meant to avoid such allocations, as with persistent
      immutable data structures. This is likely less ideal. It is true that some data parallel
      operations are not in-place against mutable data – as with PLINQ – at which point
      purity, but not immutability, is key. The two are related but not identical: immutability
      pervades the construction of data structures, whereas purity pervades the construction
      of functions. But if you can get by with one copy of the data, why not do it? Particularly
      since most datasets amenable to parallel speedups are quite large.
   </p>
        <h1>Immutability: the bricks, not the mortar
   </h1>
        <p>
      Notice that the concurrency did not actually come from immutable data structures in
      either case, however. So what are they good for?
   </p>
        <p>
      One obvious use, which has little to do with concurrency, is to enforce characteristics
      of particular data structures in a program. A translation lookup table may not have
      been meant to be written to except for initialization time, and using an immutable
      data structure is a wonderful way to enforce this intent.
   </p>
        <p>
      What about concurrency? Immutable data structures facilitate sharing data amongst
      otherwise isolated tasks in an efficient zero-copy manner. No synchronization necessary.
      This is the real payoff.
   </p>
        <p>
      For example, say we’ve got a document-editor and would like to launch a background
      task that does spellchecking in parallel. How will the spellchecker concurrently access
      the document, given that the user may continue editing it simultaneously? Likely we
      will use an immutable data structure to hold some interesting document state, such
      as storing text in a piece-table. OneNote, Visual Studio, and many other document-editors
      use this technique. This is zero-cost snapshot isolation.
   </p>
        <p>
      Not having immutability in this particular scenario is immensely painful. Isolation
      won’t work very well. You could model the document as a task, and require the spellchecker
      to interact with it using messages. Chattiness would be a concern. And, worse, the
      spellchecker’s messages may now interleave with other messages, like a user editing
      the document. Those kinds of message-passing races are non-trivial to deal with. Synchronization
      won’t work well either. Clearly we don’t want to lock the user out of editing his
      or her document just because spellchecking is occurring. Such a boneheaded design
      is what leads to spinning donuts, bleached-white screens, and “(Not Responding)” title
      bars. But clearly we don’t want to acquire a lock and then make a full copy of the
      entire document. Perhaps we’d try to copy just what is visible on the screen. This
      is a dangerous game to play.
   </p>
        <p>
      Immutability does not solve all of the problems in this scenario, however. Snapshots
      of any kind lead to a subtle issue that is familiar to those with experience doing
      multimaster, in which multiple parties have conflicting views on what “the” data ought
      to be, and in which these views must be reconciled.
   </p>
        <p>
      In this particular case, the spellchecker sends the results back to the task which
      spawned it, and presumably owns the document, when it has finished checking some portion
      of the document. Because the spellchecker was working with an immutable snapshot,
      however, its answer may now be out-of-date. We have turned the need to deal with message-level
      interleaving – as described above – into the need to deal with all of the messages
      that may have interleaved within a window of time. This is where multimaster techniques,
      such as diffing and merging come into play. Other techniques can be used, of course,
      like cancelling and ignoring out-of-date results. But it is clear something intentional
      must be done.
   </p>
        <p>
        </p>
        <h1>In conclusion
   </h1>
        <p>
      It is safe to say that immutability facilitates important concurrent architectures
      and algorithms. It can really help big time, for sure. But it is clearly no panacea.
      Whether mutability or immutability is the right choice for a particular data structure
      in your program, as with all things, depends.
   </p>
        <p>
      It could be the case that choosing a piece-table for storing your text facilitates
      large-scales of concurrency in version two of your software application, but that
      in version one you have no use for it. Making that call ahead of time may pay in spades
      down the road, even if it comes at a marginal cost up-front. Or it could be that choosing
      an immutable data structure costs you in time and space, and you never end up exploiting
      the fact that you could have shared that particular structure in a zero-cost way across
      agents in your program.
   </p>
        <p>
      One thing’s for sure: I’m glad to be programming in languages like C#, F#, Clojure,
      and Scala, where I’ve got a choice.
   </p>
        <img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=7a57f623-d65d-4212-973d-29bdcf61dd3a" />
      </body>
      <title>Thoughts on immutability and concurrency</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,7a57f623-d65d-4212-973d-29bdcf61dd3a.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2010/07/12/ThoughtsOnImmutabilityAndConcurrency.aspx</link>
      <pubDate>Mon, 12 Jul 2010 03:18:30 GMT</pubDate>
      <description>&lt;p&gt;
   That immutability facilitates increased degrees of concurrency is an oft-cited dictum.
   But is it true? And either way, why?
&lt;/p&gt;
&lt;p&gt;
   My view on this matter may be a controversial one. Immutability is an important foundational
   tool in the toolkit for building concurrent – in addition to reliable and predictable
   – software. But it is not the only one that matters. Making all your data immutable
   isn’t going to instantly lead to a massively scalable program. Natural isolation is
   also critically important, perhaps more so. And, as it turns out, sometimes mutability
   is just what the doctor ordered, as with large-scale data parallelism.
&lt;/p&gt;
&lt;p&gt;
&lt;h1&gt;Isolation first; immutability second; synchronization last
&lt;/h1&gt;
&gt;
&lt;p&gt;
   Stepping back for a moment, the recipe for concurrency is rather simple. Say you’ve
   got multiple concurrent pieces of work running simultaneously (or have a goal of getting
   there); for discussion’s sake, call them &lt;em&gt;tasks&lt;/em&gt;. Take two tasks. The first
   critical decision has two cases: either these tasks concurrently access overlapping
   data in shared-memory, or they do not. If they do not, they are &lt;em&gt;isolated&lt;/em&gt;,
   and no precautions associated with racing memory updates are needed. If they do share
   data, on the other hand, then something else must give. If all concurrently accessed
   data is &lt;em&gt;immutable&lt;/em&gt;, or all functions used to interact with data are &lt;em&gt;pure&lt;/em&gt;,
   then dangerous concurrency hazards are avoided. All is well. If some data is mutable,
   however, then this is where things get tricky, and higher-level &lt;em&gt;synchronization&lt;/em&gt; is
   needed to make accesses safe. This decision tree is straightforward and clear.
&lt;/p&gt;
&lt;p&gt;
   I have listed those four attributes – isolated, immutable and pure, and synchronization
   – in a very intentional order. Thankfully, this order mirrors the natural top-down
   hierarchical architecture of most modern object- and component-based programs: we
   have large containers that communicate through well-defined interfaces, each comprised
   of layers of such containers, and somewhere towards the leaves, a fair amount of intimate
   commingling of knowledge regarding data and invariants.
&lt;/p&gt;
&lt;p&gt;
   This order also reflects the order of complexity and execution-time costs, from least
   to most. Isolation is simple, because components depend on each other in loosely-coupled
   ways, and in fact scales superiorly in a concurrent program because no synchronization
   is necessary, the “right” data structure may be chosen for the job – immutable or
   not – and locality is part-and-parcel to the architecture. Immutability at least avoids
   the morass of synchronization, which can affect programs immensely in complexity,
   runtime overheads, and write-contention for shared data. It is clear that synchronization
   is something to avoid at all costs, particularly anything done in an ad-hoc manner
   like locks.
&lt;/p&gt;
&lt;p&gt;
&lt;h1&gt;Making the concurrency
&lt;/h1&gt;
&gt;
&lt;p&gt;
   But where did all this concurrency come from, anyway?
&lt;/p&gt;
&lt;p&gt;
   It came from two things:
&lt;/p&gt;
&lt;ol&gt;
   &lt;li&gt;
      The coarse-grained breakdown of a program into isolated pieces. 
   &lt;li&gt;
      The fine-grained data parallelism. 
&lt;/ol&gt;
&lt;p&gt;
   On #1: Program fragments that are isolated are already half-way down the road to running
   concurrently as tasks. The second half of this journey, of course, is teaching them
   to interact with one another asynchronously, most frequently through message-passing
   or by sticking them into a pipeline. The details of course depend on what programming
   language you are using. It may be through agents, actors, active objects, COM objects,
   EJBs, CCR receivers, web-services, something ad-hoc built with .NET tasks, or some
   other reification. Nevertheless the isolation is common to all these.
&lt;/p&gt;
&lt;p&gt;
   On #2: Data parallelism, it turns out, often works best with mutable data structures.
   These structures must be partitionable, of course, so that tasks comprising the data
   parallel operations may operate with logically isolated chunks of this data safely,
   even if they are parts of the same physical data structure. So chunks of them are
   isolated even though they don’t appear to be. This is trivially achievable with many
   important parallel-friendly data structures like arrays, vectors, and matrices. Capturing
   this isolation in the type system is of course no small task, though region typing
   gets close (see UIUC’s Data Parallel Java).
&lt;/p&gt;
&lt;p&gt;
   But you usually don’t want these structures to be immutable, because they can be modified
   in constant-time and space if they are their classic simple mutable forms. Programmers
   doing HPC-style data-parallelism a la FORTRAN, vectorization, and GPGPU know this
   quite well. Compare this a world where we are doing data-parallelism over immutable
   data structures, where modifications often necessitate allocations or more complicated
   big-oh times due to clever techniques meant to avoid such allocations, as with persistent
   immutable data structures. This is likely less ideal. It is true that some data parallel
   operations are not in-place against mutable data – as with PLINQ – at which point
   purity, but not immutability, is key. The two are related but not identical: immutability
   pervades the construction of data structures, whereas purity pervades the construction
   of functions. But if you can get by with one copy of the data, why not do it? Particularly
   since most datasets amenable to parallel speedups are quite large.
&lt;/p&gt;
&lt;h1&gt;Immutability: the bricks, not the mortar
&lt;/h1&gt;
&lt;p&gt;
   Notice that the concurrency did not actually come from immutable data structures in
   either case, however. So what are they good for?
&lt;/p&gt;
&lt;p&gt;
   One obvious use, which has little to do with concurrency, is to enforce characteristics
   of particular data structures in a program. A translation lookup table may not have
   been meant to be written to except for initialization time, and using an immutable
   data structure is a wonderful way to enforce this intent.
&lt;/p&gt;
&lt;p&gt;
   What about concurrency? Immutable data structures facilitate sharing data amongst
   otherwise isolated tasks in an efficient zero-copy manner. No synchronization necessary.
   This is the real payoff.
&lt;/p&gt;
&lt;p&gt;
   For example, say we’ve got a document-editor and would like to launch a background
   task that does spellchecking in parallel. How will the spellchecker concurrently access
   the document, given that the user may continue editing it simultaneously? Likely we
   will use an immutable data structure to hold some interesting document state, such
   as storing text in a piece-table. OneNote, Visual Studio, and many other document-editors
   use this technique. This is zero-cost snapshot isolation.
&lt;/p&gt;
&lt;p&gt;
   Not having immutability in this particular scenario is immensely painful. Isolation
   won’t work very well. You could model the document as a task, and require the spellchecker
   to interact with it using messages. Chattiness would be a concern. And, worse, the
   spellchecker’s messages may now interleave with other messages, like a user editing
   the document. Those kinds of message-passing races are non-trivial to deal with. Synchronization
   won’t work well either. Clearly we don’t want to lock the user out of editing his
   or her document just because spellchecking is occurring. Such a boneheaded design
   is what leads to spinning donuts, bleached-white screens, and “(Not Responding)” title
   bars. But clearly we don’t want to acquire a lock and then make a full copy of the
   entire document. Perhaps we’d try to copy just what is visible on the screen. This
   is a dangerous game to play.
&lt;/p&gt;
&lt;p&gt;
   Immutability does not solve all of the problems in this scenario, however. Snapshots
   of any kind lead to a subtle issue that is familiar to those with experience doing
   multimaster, in which multiple parties have conflicting views on what “the” data ought
   to be, and in which these views must be reconciled.
&lt;/p&gt;
&lt;p&gt;
   In this particular case, the spellchecker sends the results back to the task which
   spawned it, and presumably owns the document, when it has finished checking some portion
   of the document. Because the spellchecker was working with an immutable snapshot,
   however, its answer may now be out-of-date. We have turned the need to deal with message-level
   interleaving – as described above – into the need to deal with all of the messages
   that may have interleaved within a window of time. This is where multimaster techniques,
   such as diffing and merging come into play. Other techniques can be used, of course,
   like cancelling and ignoring out-of-date results. But it is clear something intentional
   must be done.
&lt;/p&gt;
&lt;p&gt;
&lt;h1&gt;In conclusion
&lt;/h1&gt;
&gt;
&lt;p&gt;
   It is safe to say that immutability facilitates important concurrent architectures
   and algorithms. It can really help big time, for sure. But it is clearly no panacea.
   Whether mutability or immutability is the right choice for a particular data structure
   in your program, as with all things, depends.
&lt;/p&gt;
&lt;p&gt;
   It could be the case that choosing a piece-table for storing your text facilitates
   large-scales of concurrency in version two of your software application, but that
   in version one you have no use for it. Making that call ahead of time may pay in spades
   down the road, even if it comes at a marginal cost up-front. Or it could be that choosing
   an immutable data structure costs you in time and space, and you never end up exploiting
   the fact that you could have shared that particular structure in a zero-cost way across
   agents in your program.
&lt;/p&gt;
&lt;p&gt;
   One thing’s for sure: I’m glad to be programming in languages like C#, F#, Clojure,
   and Scala, where I’ve got a choice.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=7a57f623-d65d-4212-973d-29bdcf61dd3a" /&gt;</description>
      <category>Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=4a49c0a0-3e4e-4d9f-b9de-599d89a94283</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,4a49c0a0-3e4e-4d9f-b9de-599d89a94283.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
      In .NET today, readonly/initonly-ness is in the eye of the provider. Not the beholder. 
   </p>
        <p>
      Although both C# and the CLR verifier go to great pains to ensure you don't change
      a readonly/initonly field outside of its constructor (or class constructor, in the
      case of a static field), this guarantee doesn't imply what you might imagine. It means
      what it says: you can't change such fields except for in certain contexts. 
   </p>
        <p>
      If you try, C# won't let you, including forming byrefs to them: 
   </p>
        <p>
          <code>
            <blockquote>
              <pre>
v.cs(0,0): error CS0191: A readonly field cannot be assigned to (except in a constructor or a variable initializer)
v.cs(0,0): error CS0192: A readonly field cannot be passed ref or out (except in a constructor)
v.cs(0,0): error CS0198: A static readonly field cannot be assigned to (except in a static constructor or a variable initializer)
v.cs(0,0): error CS0199: A static readonly field cannot be passed ref or out (except in a static constructor)
</pre>
            </blockquote>
          </code>
        </p>
        <p>
        </p>
        <p>
      And neither will the CLR verifier: 
   </p>
        <p>
          <code>
            <blockquote>
              <pre>
[IL]: Error: [c:\v.exe : C::Main][offset 0x00000001] Cannot change initonly
    field outside its .ctor.
</pre>
            </blockquote>
          </code>
        </p>
        <p>
        </p>
        <p>
      Of course, attempting to invoke an operation on a readonly struct will make a defensive
      copy locally, and invoke the method against that. This ensures the readonly contents
      cannot change. 
   </p>
        <p>
      One unfortunate hole in this safety is with unions. You do not need unsafe code to
      break readonly, and yet the effect is the same as with an unverifiable program that
      writes to a readonly field: 
   </p>
        <p>
          <code>
            <blockquote>
              <pre>struct S1 {
    public readonly int X;
}

struct S2 {
    public int X;
}

[StructLayout(LayoutKind.Explicit)]
struct S3 {
    [FieldOffset(0)]
    public S1 A;
    [FieldOffset(0)]
    public S2 B;
}
</pre>
            </blockquote>
          </code>
        </p>
        <p>
        </p>
        <p>
      Now we can change A.X via B.X, even though A.X is supposedly readonly: 
   </p>
        <p>
          <code>
            <blockquote>
              <pre>S3 s3 = ...;
int x = s3.A.X;
s3.B.X++;
ASSERT(x == s3.A.X); // false; it is +1
</pre>
            </blockquote>
          </code>
        </p>
        <p>
        </p>
        <p>
      The same would have been true even if the field S3.A was marked readonly. 
   </p>
        <p>
      This is quite an evil trick. I have to be honest that I believe this is a CIL verification
      hole, and should produce unverifiable MSIL much like when you try to overlay structs
      containing overlapping GC references. Nevertheless, it is what it is. 
   </p>
        <p>
      Let's step back. Why does all of this matter, anyway, and what guarantees were we
      hoping that readonly would provide?
   </p>
        <p>
      It would be ideal, I assert, if the guarantee was not just "the target field can only
      be written to in the constructor", but also "the target field, once read, cannot be
      observed with a different value later on". This would not be true during construction,
      but we'd like to say it holds at all other times. 
   </p>
        <p>
      The above example throws a wrench in this idea. As does the following example. But
      this new example will be more disturbing, because the solution is not a simple verifier
      change. 
   </p>
        <p>
      What would you expect this program to print to the console? 
   </p>
        <p>
          <code>
            <blockquote>
              <pre>struct S {
    public readonly int X;

    public S(int x) { X = x; }

    public void MultiplyInto(int c, out S target) {
        System.Console.WriteLine(this.X);
        target = new S(X * c);
        System.Console.WriteLine(this.X); // same? it is, after all, readonly.
    }
}

S s = new S(42);
s.MultiplyInto(10, out s);
</pre>
            </blockquote>
          </code>
        </p>
        <p>
        </p>
        <p>
      As you may or may not have guessed, the output is "42" followed by "420". Yes, the
      value of 'this.X' changes after we have assigned to 'target' inside MultiplyTo, because
      the caller aliases the out-param with the 'this' param. Recall that parameter passing
      for structs in C# is done byref, so that these two references actually physically
      point to the same location when that call is made. The assignment to 'target', therefore,
      actually replaces the entire contents of 'this' all at once. And hence this gives
      the illusion that readonly fields are shifting. 
   </p>
        <p>
      You might be tempted to say that this can be prevented with alias analysis. But this
      is deceptively difficult to do. Consider this more complicated example: 
   </p>
        <p>
          <code>
            <blockquote>
              <pre>class C {
    public struct S S;
}

void M1(C c) {
    M2(c, out c.S);
}

void M2(C c, out S s) {
    c.S.MultiplyInto(10, out s);
}
</pre>
            </blockquote>
          </code>
        </p>
        <p>
        </p>
        <p>
      It is in no way clear inside M2 that the two aliases refer to the same location. The
      aliasing occurred higher up in the stack. Although byrefs are restricted to stack-only
      passing, making the necessary alias analysis tantalizingly close to attainable, it
      is nontrivial to say the least. Presumably we would have had to have blocked the forming
      of the byref within M1, rather than its use within M2. We could fall back to runtime
      checks, but that is also unfortunate for numerous reasons. 
   </p>
        <p>
      The moral of the story? Structs as containers of readonly values are not to be trusted,
      at least not for situations that call for bulletproof safety, such as caching values
      in the compiler rather than rereading them, because the fields are readonly. Although
      C# and the CLR do a good job at verifying readonly/initonly are done right at the
      initialization site, there are still places where these guarantees break down. Thankfully
      the byref aliasing problem does not threaten thread-safety, but the union problem
      does. And in conclusion, I do have to imagine all of this will get fixed somewhere
      down the road, it's just a matter of when and where. 
   </p>
        <img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=4a49c0a0-3e4e-4d9f-b9de-599d89a94283" />
      </body>
      <title>When is a readonly field not readonly?</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,4a49c0a0-3e4e-4d9f-b9de-599d89a94283.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2010/07/01/WhenIsAReadonlyFieldNotReadonly.aspx</link>
      <pubDate>Thu, 01 Jul 2010 19:41:20 GMT</pubDate>
      <description>&lt;p&gt;
   In .NET today, readonly/initonly-ness is in the eye of the provider. Not the beholder. 
&lt;/p&gt;
&lt;p&gt;
   Although both C# and the CLR verifier go to great pains to ensure you don't change
   a readonly/initonly field outside of its constructor (or class constructor, in the
   case of a static field), this guarantee doesn't imply what you might imagine. It means
   what it says: you can't change such fields except for in certain contexts. 
&lt;/p&gt;
&lt;p&gt;
   If you try, C# won't let you, including forming byrefs to them: 
&lt;/p&gt;
&lt;p&gt;
   &lt;code&gt; &lt;blockquote&gt;&lt;pre&gt;
v.cs(0,0): error CS0191: A readonly field cannot be assigned to (except in a constructor or a variable initializer)
v.cs(0,0): error CS0192: A readonly field cannot be passed ref or out (except in a constructor)
v.cs(0,0): error CS0198: A static readonly field cannot be assigned to (except in a static constructor or a variable initializer)
v.cs(0,0): error CS0199: A static readonly field cannot be passed ref or out (except in a static constructor)
&lt;/pre&gt;
   &lt;/blockquote&gt;&lt;/code&gt; 
&lt;p&gt;
&lt;/p&gt;
&lt;p&gt;
   And neither will the CLR verifier: 
&lt;/p&gt;
&lt;p&gt;
   &lt;code&gt; &lt;blockquote&gt;&lt;pre&gt;
[IL]: Error: [c:\v.exe : C::Main][offset 0x00000001] Cannot change initonly
    field outside its .ctor.
&lt;/pre&gt;
   &lt;/blockquote&gt;&lt;/code&gt; 
&lt;p&gt;
&lt;/p&gt;
&lt;p&gt;
   Of course, attempting to invoke an operation on a readonly struct will make a defensive
   copy locally, and invoke the method against that. This ensures the readonly contents
   cannot change. 
&lt;/p&gt;
&lt;p&gt;
   One unfortunate hole in this safety is with unions. You do not need unsafe code to
   break readonly, and yet the effect is the same as with an unverifiable program that
   writes to a readonly field: 
&lt;/p&gt;
&lt;p&gt;
   &lt;code&gt; &lt;blockquote&gt;&lt;pre&gt;struct S1 {
    public readonly int X;
}

struct S2 {
    public int X;
}

[StructLayout(LayoutKind.Explicit)]
struct S3 {
    [FieldOffset(0)]
    public S1 A;
    [FieldOffset(0)]
    public S2 B;
}
&lt;/pre&gt;
   &lt;/blockquote&gt;&lt;/code&gt; 
&lt;p&gt;
&lt;/p&gt;
&lt;p&gt;
   Now we can change A.X via B.X, even though A.X is supposedly readonly: 
&lt;/p&gt;
&lt;p&gt;
   &lt;code&gt; &lt;blockquote&gt;&lt;pre&gt;S3 s3 = ...;
int x = s3.A.X;
s3.B.X++;
ASSERT(x == s3.A.X); // false; it is +1
&lt;/pre&gt;
   &lt;/blockquote&gt;&lt;/code&gt; 
&lt;p&gt;
&lt;/p&gt;
&lt;p&gt;
   The same would have been true even if the field S3.A was marked readonly. 
&lt;/p&gt;
&lt;p&gt;
   This is quite an evil trick. I have to be honest that I believe this is a CIL verification
   hole, and should produce unverifiable MSIL much like when you try to overlay structs
   containing overlapping GC references. Nevertheless, it is what it is. 
&lt;/p&gt;
&lt;p&gt;
   Let's step back. Why does all of this matter, anyway, and what guarantees were we
   hoping that readonly would provide?
&lt;/p&gt;
&lt;p&gt;
   It would be ideal, I assert, if the guarantee was not just "the target field can only
   be written to in the constructor", but also "the target field, once read, cannot be
   observed with a different value later on". This would not be true during construction,
   but we'd like to say it holds at all other times. 
&lt;/p&gt;
&lt;p&gt;
   The above example throws a wrench in this idea. As does the following example. But
   this new example will be more disturbing, because the solution is not a simple verifier
   change. 
&lt;/p&gt;
&lt;p&gt;
   What would you expect this program to print to the console? 
&lt;/p&gt;
&lt;p&gt;
   &lt;code&gt; &lt;blockquote&gt;&lt;pre&gt;struct S {
    public readonly int X;

    public S(int x) { X = x; }

    public void MultiplyInto(int c, out S target) {
        System.Console.WriteLine(this.X);
        target = new S(X * c);
        System.Console.WriteLine(this.X); // same? it is, after all, readonly.
    }
}

S s = new S(42);
s.MultiplyInto(10, out s);
&lt;/pre&gt;
   &lt;/blockquote&gt;&lt;/code&gt; 
&lt;p&gt;
&lt;/p&gt;
&lt;p&gt;
   As you may or may not have guessed, the output is "42" followed by "420". Yes, the
   value of 'this.X' changes after we have assigned to 'target' inside MultiplyTo, because
   the caller aliases the out-param with the 'this' param. Recall that parameter passing
   for structs in C# is done byref, so that these two references actually physically
   point to the same location when that call is made. The assignment to 'target', therefore,
   actually replaces the entire contents of 'this' all at once. And hence this gives
   the illusion that readonly fields are shifting. 
&lt;/p&gt;
&lt;p&gt;
   You might be tempted to say that this can be prevented with alias analysis. But this
   is deceptively difficult to do. Consider this more complicated example: 
&lt;/p&gt;
&lt;p&gt;
   &lt;code&gt; &lt;blockquote&gt;&lt;pre&gt;class C {
    public struct S S;
}

void M1(C c) {
    M2(c, out c.S);
}

void M2(C c, out S s) {
    c.S.MultiplyInto(10, out s);
}
&lt;/pre&gt;
   &lt;/blockquote&gt;&lt;/code&gt; 
&lt;p&gt;
&lt;/p&gt;
&lt;p&gt;
   It is in no way clear inside M2 that the two aliases refer to the same location. The
   aliasing occurred higher up in the stack. Although byrefs are restricted to stack-only
   passing, making the necessary alias analysis tantalizingly close to attainable, it
   is nontrivial to say the least. Presumably we would have had to have blocked the forming
   of the byref within M1, rather than its use within M2. We could fall back to runtime
   checks, but that is also unfortunate for numerous reasons. 
&lt;/p&gt;
&lt;p&gt;
   The moral of the story? Structs as containers of readonly values are not to be trusted,
   at least not for situations that call for bulletproof safety, such as caching values
   in the compiler rather than rereading them, because the fields are readonly. Although
   C# and the CLR do a good job at verifying readonly/initonly are done right at the
   initialization site, there are still places where these guarantees break down. Thankfully
   the byref aliasing problem does not threaten thread-safety, but the union problem
   does. And in conclusion, I do have to imagine all of this will get fixed somewhere
   down the road, it's just a matter of when and where. 
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=4a49c0a0-3e4e-4d9f-b9de-599d89a94283" /&gt;</description>
      <category>Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=1f0b032e-5871-4749-9c74-71fe205c9bc5</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,1f0b032e-5871-4749-9c74-71fe205c9bc5.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <title>On partially-constructed objects</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,1f0b032e-5871-4749-9c74-71fe205c9bc5.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2010/06/27/OnPartiallyconstructedObjects.aspx</link>
      <pubDate>Sun, 27 Jun 2010 19:52:29 GMT</pubDate>
      <description>&lt;p&gt;
   Partially-constructed objects are a constant source of difficulty in object-oriented
   systems. These are objects whose construction has not completed in its entirety prior
   to another piece of code trying to use them. Such uses are often error-prone, because
   the object in question is likely not in a consistent state. Because this situation
   is comparatively rare in the wild, however, most people (safely) ignore or remain
   ignorant of the problem. Until one day they get bitten. 
&lt;/p&gt;
&lt;p&gt;
   Not only are partially-constructed objects a source of consternation for everyday
   programmers, they are also a challenge for language designers wanting to provide guarantees
   around invariants, immutability and concurrency-safety, and non-nullability. We shall
   see examples below why this is true. The world would be better off if partially-constructed
   objects did not exist. Thankfully there is some interesting prior art that moves us
   in this direction from which to learn. 
&lt;/p&gt;
&lt;p&gt;
&lt;h1&gt;Seeing such a beast in the wild
&lt;/h1&gt;
&gt;
&lt;p&gt;
   In what situations might you see a partially-constructed object? There are two common
   ones in C++ and C#: 
&lt;/p&gt;
&lt;ul&gt;
   &lt;li&gt;
      ‘this’ is leaked out of a constructor to some code that assumes the object has been
      initialized.&lt;/li&gt;
   &lt;li&gt;
      A failure partway through an object’s construction leads to its destructor or finalizer
      running against a partially-constructed object.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;
   In the first case, the rule of thumb is “don’t do that.” This is easier said than
   done. The second case, on the other hand, is a fact of life, and the rule of thumb
   is “tread with care, and be intentional.” Let’s examine both more closely. 
&lt;/p&gt;
&lt;p&gt;
&lt;h2&gt;The evils of leaking ‘this’
&lt;/h2&gt;
&gt;
&lt;p&gt;
   Leaking ‘this’ during construction to code that expects to see a fully-initialized
   object is a terrible practice. Before moving on, it’s important to remember initialization
   order in C++ and C#: base constructors run first, and then more derived constructors.
   If I have E subclasses D subclasses C, then constructing an instance of E will run
   C’s constructor, and then D’s, and then lastly E’s. Destructors in C++, of course,
   run in the reverse order. 
&lt;/p&gt;
&lt;p&gt;
   Member initializers, on the other hand, run in different orders in C++ versus C#.
   In C#, they run from most derived first, to least derived. So in the above situation,
   E’s initializers run, and then D’s, and then C’s. This happens fully before running
   the ad-hoc constructor code. In C++, however, member initializers run alongside the
   ordinary construction process. C’s member initializers run just before C’s ad-hoc
   construction code, and then D’s, and then E’s. Another difference is that C#’s initializers
   cannot access ‘this’, whereas C++’s initializers can. 
&lt;/p&gt;
&lt;p&gt;
   For example, this C# program will print E_init, D_init, C_init, C_ctor, D_ctor, and
   then E_ctor: 
&lt;/p&gt;
&lt;p&gt;
   &lt;code&gt; &lt;blockquote&gt; &lt;pre&gt;
using System;
class C {
    int x = M();
    public C() {
        Console.WriteLine("C_ctor");
    }
    private static int M() {
        Console.WriteLine("C_init");
        return 42;
    }
}

class D : C {
    int x = M();
    public D() : base() {
        Console.WriteLine("D_ctor");
    }
    private static int M() {
        Console.WriteLine("D_init");
        return 42;
    }
}

class E : D {
    int x = M();
    public E() : base() {
        Console.WriteLine("E_ctor");
    }
    private static int M() {
        Console.WriteLine("E_init");
        return 42;
    }
}

class Program {
    public static void Main() {
        new E();
    }
}
&lt;/pre&gt;
   &lt;/blockquote&gt; &lt;/code&gt; 
&lt;/p&gt;
&lt;p&gt;
   And this C++ program will print C_init, C_ctor, D_init, D_ctor, E_init, E_ctor, ~E,
   ~D, and finally ~C: 
&lt;/p&gt;
&lt;p&gt;
   &lt;code&gt; &lt;blockquote&gt; &lt;pre&gt;
#include 
   &lt;iostream&gt;&lt;&lt; "C_ctor" &lt;&lt; endl;   }
    ~C() { cout &lt;&lt; "~C" &lt;&lt; endl; }
    static int M() { cout &lt;&lt; "C_init" &lt;&lt; endl; return 42; }
};

struct D : C {
   int x;
    D(): x(M()) { cout &lt;&lt; "D_ctor" &lt;&lt; endl; }
    ~D() { cout &lt;&lt; "~D" &lt;&lt; endl; }
    static int M() { cout &lt;&lt; "D_init" &lt;&lt; endl; return 42; }
};

struct E : D {
   int x;
    E() : x(M()) { cout &lt;&lt; "E_ctor" &lt;&lt; endl; }
    ~E() { cout &lt;&lt; "~E" &lt;&lt; endl; }
    static int M() { cout &lt;&lt; "E_init" &lt;&lt; endl; return 42; }
};

static void main() {
    E e;
}
&lt;/pre&gt;
      using namespace std; struct C { int x; C() : x(M()) { cout 
   &lt;/blockquote&gt; &lt;/code&gt; 
&lt;/p&gt;
&lt;p&gt;
   It’s interesting to note that the CLR allows constructor chaining to happen in any
   order. The C# compiler emits the calls to base as the first thing a constructor does,
   but other languages can choose to do differently. The verifier ensures that a call
   has occurred somewhere in the constructor body before returning. 
&lt;/p&gt;
&lt;p&gt;
   This IL program, for example, will print E, D, and then C – the reverse of what C#
   gives us: 
&lt;/p&gt;
&lt;p&gt;
   &lt;code&gt; &lt;blockquote&gt; &lt;pre&gt;
.assembly extern mscorlib { }
.assembly ctor { }

.class C {
  .method public specialname rtspecialname instance void .ctor() cil managed {
    ldstr      "C"
    call       void [mscorlib]System.Console::WriteLine(string)
    ldarg.0
    call       instance void [mscorlib]System.Object::.ctor()
    ret
  }
}

.class D extends C {
  .method public specialname rtspecialname instance void .ctor() cil managed {
    ldstr      "D"
    call       void [mscorlib]System.Console::WriteLine(string)
    ldarg.0
    call       instance void C::.ctor()
    ret
  }
}

.class E extends D {
  .method public specialname rtspecialname instance void .ctor() cil managed {
    ldstr      "E"
    call       void [mscorlib]System.Console::WriteLine(string)
    ldarg.0
    call       instance void D::.ctor()
    ret
  }
}

.class Program {
  .method public static void Main() cil managed {
    .entrypoint
    newobj     instance void E::.ctor()
    pop
    ret
  }
}
&lt;/pre&gt;
   &lt;/blockquote&gt; &lt;/code&gt; 
&lt;/p&gt;
&lt;p&gt;
   So why is leaking ‘this’ bad, anyway? 
&lt;/p&gt;
&lt;p&gt;
   Say you’ve decided in the implementation of D’s constructor that you would like to
   stick ‘this’ into a global hash-map. Doing this sadly means other code could grab
   the pointer and begin accessing it before E’s constructor has even run. That is a
   race at-best and a ticking time-bomb in all likelihood. For example: 
&lt;/p&gt;
&lt;p&gt;
   &lt;code&gt; &lt;blockquote&gt; &lt;pre&gt;&lt;int, C&gt;
class C {
    public static Dictionary s_globalLookup;
    private readonly int m_key;
    public C(int key) {
        m_key = key;
        s_globalLookup.Add(key, this);
    }
}
&lt;/pre&gt;
   &lt;/blockquote&gt; &lt;/code&gt; 
&lt;/p&gt;
&lt;p&gt;
   Even though we have taken great care to initialize our readonly field m_key before
   sticking ‘this’ into a dictionary, any subclasses will not have been initialized at
   this point. Only if C is sealed can we be assured of this. Another piece of code that
   grabs the element out of the hashtable and begins calling virtual methods on it, for
   example, is in a race with the completion of the initialization code for subclasses. 
&lt;/p&gt;
&lt;p&gt;
   Leaking ‘this’ isn’t always such an explicit act. Merely invoking a virtual method
   within the constructor may dispatch a more derived class’s override before the more
   derived class’s constructor has run. And therefore its state is most likely not in
   a place conducive to correct execution of that override. It is fairly common knowledge
   that invoking virtual methods during construction is an extraordinarily poor practice,
   and best avoided. 
&lt;/p&gt;
&lt;p&gt;
&lt;h2&gt;Failure to construct
&lt;/h2&gt;
&gt;
&lt;p&gt;
   Let’s move on to the second issue. Imagine we suffer an exception during construction
   of an object. Perhaps this is due to a failure to allocate resources, or maybe even
   due to argument validation. It is clear that a leaked ‘this’ object in such cases
   will be a problem, because the object will have escaped into the wild even though
   its initialization failed. Subsequent attempts to use the object will obviously pose
   problems. What is more subtle is that if the class in question declares a destructor
   (in C++) or finalizer (in C#), a problem may now be lurking. 
&lt;/p&gt;
&lt;p&gt;
   Let’s say we have the original situation shown above: C derives from D derives from
   E. Now say an exception happens within D’s constructor. At this point in time, C’s
   constructor has run to completion, D’s constructor has run partially up to the point
   of failure, and E’s has not run at all. (And, of course, in the case of C#, all member-initializers
   for all classes have actually run.) What happens to the cleanup code? 
&lt;/p&gt;
&lt;p&gt;
   In C++, only constructors that have run will have their corresponding destructors
   executed. In the above situation, where C, D, and E each declares a destructor, only
   C’s will be run during stack unwind. It is imperative, therefore, that D handles failure
   within its constructor rather than relying on the destructor. For example: 
&lt;/p&gt;
&lt;p&gt;
   &lt;code&gt; &lt;blockquote&gt; &lt;pre&gt;
class D : C {
    int* m_pBuf1;
    int* m_pBuf2;
public:
    D() {
        m_pBuf1 = … alloc …;
        m_pBuf2 = … alloc …;
    }
    ~D() {
        if (m_pBuf2) … free …;
        if (m_pBuf1) … free …;
    }
}
&lt;/pre&gt;
   &lt;/blockquote&gt; &lt;/code&gt; 
&lt;/p&gt;
&lt;p&gt;
   If the allocation destined for m_pBuf2 fails by throwing an exception, the destructor
   for D will not run, and therefore m_pBuf1 will leak. The C++ solution to this particular
   example is to use smart pointers and member initializers for the allocations, because
   successfully initialized members do get destructed, even when the constructor (or
   indeed one of the member initializers) subsequently fails. This means that destructors
   for a particular class do not have to tolerate that class’s state not having been
   fully constructed, because those destructors will never run. Destructors must not,
   however, invoke virtuals, for two obvious reasons: (a) subclasses may not have ever
   been initialized, and (b) destructors run in reverse order, and so the destruction
   code for the subclass will have already run by the time a baseclass’s destructor runs. 
&lt;/p&gt;
&lt;p&gt;
   In C#, finalizers will run, regardless of whether an object’s constructor ran fully,
   partway through, or not at all. If the object is allocated – and so long as GC.SuppressFinalize
   isn’t called on it – the finalizer runs. This distinctly means that C# finalizers
   must always be resilient to partially-constructed objects (unlike C++ destructors).
   Thankfully finalizers are a rare bird, and therefore this issue is seldom even noticed
   by .NET programmers. 
&lt;/p&gt;
&lt;p&gt;
   This issue does not arise in the case of .NET’s IDisposable pattern. If a constructor
   throws, the assignment to the target local variable does not occur. And therefore
   the variable enclosed in, say, a using statement remains null. This means that there
   is no way to possibly invoke Dispose on the object. Moreover, the allocation in using
   occurs prior to entering the try/finally block. Hence, you really had better be writing
   constructors that don’t throw and/or protecting such resources with smart-pointer-like
   things with finalizers, a la SafeHandle. 
&lt;/p&gt;
&lt;p&gt;
&lt;h1&gt;Impediments to language support
&lt;/h1&gt;
&gt;
&lt;p&gt;
   As if these weren’t sufficient cause for concern, I also mentioned earlier – and somewhat
   vaguely – that partially-constructed objects interfere with language designers’ efforts
   to add invariants, immutability and concurrency-safety, and non-nullability to the
   language. And all of these are important properties to consider in our present age
   of complexity and concurrency, so this point is worth understanding more deeply. Let’s
   look at each in-order. 
&lt;/p&gt;
&lt;p&gt;
&lt;h2&gt;Invariants and safe-points
&lt;/h2&gt;
&gt;
A partially-constructed object obviously may have broken invariants. By definition,
invariants are meant to hold at the end of construction, and so if construction never
completes, the rules of engagement are being broken. &gt;
&lt;p&gt;
   Imagine, for example: 
&lt;/p&gt;
&lt;p&gt;
   &lt;code&gt; &lt;blockquote&gt; &lt;pre&gt;&lt; m_y;
    public C(int a) {
        m_x = a;
        m_y = a + 1;
    }
}
&lt;/pre&gt;
class C {
    int m_x;
    int m_y;
    invariant m_x 

   &lt;/blockquote&gt; &lt;/code&gt; 
&lt;/p&gt;
&lt;p&gt;
   It is ordinarily very difficult to ensure that each instruction atomically transitions
   the state of an object from one invariant safe-point to another. A common technique
   is to define well-defined points at which invariants must hold. We might model each
   failure point as one such technique. But even in C#, the above program does not satisfy
   this constraint, because an overflow exception may be generated at the ‘m_y = a +
   1’ line. Or a thread-abort exception may be generated right in the middle of those
   two functions. Or, if addition were implemented as a JIT helper, an out-of-memory
   exception could get generated due to failure to JIT the helper function. 
&lt;/p&gt;
&lt;p&gt;
   In such cases, we’d like to say that the object does not exist. But the sad fact is
   that the object *does* exist, and if the object has acquired physical resources at
   the time of failure to construct, we must compensate and reclaim them. The ideal world
   looks a lot like object construction as transactions, where the end of construction
   is the commit-point. The state-of-the-art is very different from this, however, and
   so any static verification and theorem proving that depends on invariants on an object
   holding, well, invariantly, is subject to being broken by partially-constructed objects. 
&lt;/p&gt;
&lt;p&gt;
&lt;h2&gt;Immutability… or not
&lt;/h2&gt;
&gt;
&lt;p&gt;
   Immutability is also threatened by partially-constructed objects. Immutability is
   a one of many first class techniques for solving concurrency-safety in the language,
   so this one is quite unfortunate. 
&lt;/p&gt;
&lt;p&gt;
   In C#, for example, we might be tempted to say that a shallowly immutable type is
   one whose fields are all marked readonly. (And a deeply immutable type is one whose
   fields are all readonly, and also refer to types that are immutable.) A readonly field
   cannot be written to after construction has completed. Unfortunately, if the ‘this’
   leaks out during construction, we may see those readonly fields in an uninitialized
   or even changing state: 
&lt;/p&gt;
&lt;p&gt;
   &lt;code&gt; &lt;blockquote&gt; &lt;pre&gt;
class C {
    public static C s_c;
    readonly int m_x;
    public C() {
        m_x = 42;
        s_c = this;
        while (true) {
            ++m_x;
        }
    }
}
&lt;/pre&gt;
   &lt;/blockquote&gt; &lt;/code&gt; 
&lt;/p&gt;
&lt;p&gt;
   This is quite evidently a terrible and malicious program. C appears to be immutable,
   because it only contains readonly fields, but is quite clearly not, because the value
   of m_x is assigned to multiple times. If we had a guarantee that all readonly fields
   were definitely assigned once-and-only-once before ‘this’ can escape, then we’d be
   close to a solution. But of course we have no such guarantee. In C#, at least. 
&lt;/p&gt;
&lt;p&gt;
   A related issue is co-initialization of objects. This is interesting and relevant,
   because in such cases we actually want to leak out partially-constructed objects.
   Imagine we’d like to build a cyclic graph comprised of two nodes, A and B, each referring
   to the other. With a naïve approach to immutability, we simply cannot make this work.
   Either A must first refer to B, in which case A refers to a partially-constructed
   B; or B must first refer to A, in which case B refers to a partially-constructed A.
   Both are equally as bad. The two assignments are not atomic. 
&lt;/p&gt;
&lt;p&gt;
   Cyclic data structures are a commonly cited weakness of immutability, and an argument
   in favor of supporting partially-instantiated objects in a first class way, although
   there are approaches that can work. One example is to separate edges from nodes, and
   represent them with different data structures. We can then build the nodes A and B,
   and then build the edges A-&gt;B and B-&gt;A without needing to use cycles. 
&lt;/p&gt;
&lt;p&gt;
&lt;h2&gt;It’s not-null, or at least it wasn’t supposed to be
&lt;/h2&gt;
&gt;
&lt;p&gt;
   Tony Hoare called it his billion-dollar mistake: the introduction of nulls into a
   programming language. I think he sells himself short, however, because the absence
   of nulls in an imperative programming language – however worthy a pursuit – is actually
   a notoriously difficult to attain. 
&lt;/p&gt;
&lt;p&gt;
   Spec# is one example of a C-style language with non-nullability, in which T! represents
   a “non-null T”, for any T. Although this is done in a pleasant way conducive to C#-compatibility
   – a significant goal of Spec# -- I’d personally prefer to see the polarity switched:
   T would mean “non-null T” and T? would mean “nullable T”, for any T, reference- and
   value-types included. This is much more like Haskell’s Maybe monad, and is the syntax
   I’ll use below for illustration purposes. But I digress. 
&lt;/p&gt;
&lt;p&gt;
   Non-nullability is a wonderful invention, because it is common for 75% or more of
   the contracts and assertions in modern programs to be about a pointer being non-null
   prior to dereferencing it, both in C# and in C++. Relying on the type-system to prove
   the absence of nulls is one big step towards creating programs that are robust and
   correct out-of-the-gate, particularly for systems software where such degrees of reliability
   are important. And it cuts down on all those boilerplate contracts sprinkled throughout
   a program. Instead of: 
&lt;/p&gt;
&lt;p&gt;
   &lt;code&gt; &lt;blockquote&gt; &lt;pre&gt;
void M(C c, D d, E e)
    requires c != null, d != null, e != null
{
    … use(c, d, e) …
}
&lt;/pre&gt;
   &lt;/blockquote&gt; &lt;/code&gt; 
&lt;/p&gt;
&lt;p&gt;
   You simply say: 
&lt;/p&gt;
&lt;p&gt;
   &lt;code&gt; &lt;blockquote&gt; &lt;pre&gt;
void M(C c, D d, E e)
{
    … use (c, d, e) …
}
&lt;/pre&gt;
   &lt;/blockquote&gt; &lt;/code&gt; 
&lt;/p&gt;
&lt;p&gt;
   No opportunity to miss one, and no need for runtime checks. It’s beautiful. 
&lt;/p&gt;
&lt;p&gt;
   A problem quickly arises, however, having to do with partially-constructed objects.
   All of an object’s fields cannot possibly be non-null while the constructor is executing,
   because the object has been zero’d out and the assignments to its fields have not
   yet been made. Clearly constructor code needs to be treated “differently” somehow.
   We cannot simply live with the fact that ‘this’ escaping leads to a partially-constructed
   object leaking out into the program, because that could lead to serious errors. These
   serious errors include potential security holes, if unsafe code is manipulating the
   supposedly non-null pointer. One advantage to adding non-nullability is that runtime
   null checks can be omitted, because the type system guarantees the absence of nulls
   in certain places. In this situation, partially-constructed objects lead to holes
   in our nice type system support. Either the dynamic non-null checks are required as
   back-stop, or we’ll need some other coping technique. 
&lt;/p&gt;
&lt;p&gt;
   There are related issues with non-nullability, like with partially-populated arrays.
   Imagine we’d like to allocate an array of 1M elements of type T, and we will proceed
   to populate those elements following the array’s allocation. There’s clearly a window
   of time during which the array contains 1M nulls, and then 1M-1 nulls, …, and if we
   finish 1M-1M nulls, i.e. 0 nulls. It is only at that last transition that the array
   can be considered to contain non-null T’s. The standard technique is to use an explicit
   dynamic conversion check, or to force the creation of the list to supply all of the
   elements of the array at construction time. 
&lt;/p&gt;
&lt;p&gt;
&lt;h1&gt;Coping techniques
&lt;/h1&gt;
&gt;
&lt;p&gt;
   There are, thankfully, some interesting ways to cope with partially-constructed objects.
   There is, in fact, a spectrum of techniques, ranging from escape analysis in various
   forms, to limitations around how objects are constructed such that a partially-constructed
   one can never leak, to automatic insertion of dynamic checks to prevent the use of
   partially-constructed objects, to static annotations that treat partially-constructed
   instances as first class things in the type system. And of course there’s always the
   technique of “deal with it”, which is the one that most C++-style languages have chosen,
   including our beloved C#. 
&lt;/p&gt;
&lt;p&gt;
&lt;h2&gt;The F# approach: restrictions and dynamic checks
&lt;/h2&gt;
&gt;
&lt;p&gt;
   F#, it turns out, does a very good job in preventing partially-initialized objects.
   A first important step is that fields in F# are readonly by-default, unless you opt-in
   to mutability using the mutable keyword. Therefore data structures are mostly immutable.
   And the construction rules are meant to make it very unlikely that you’ll expose a
   partially-constructed object during construction. How so? It’s simple: such fields
   must be initialized prior to running ad-hoc construction code, and if you attempt
   to initialize them multiple times, the compiler supplies an error. In other words,
   you really have to work hard to screw yourself, unlike C++ and C# where it’s very
   easy. 
&lt;/p&gt;
&lt;p&gt;
   It’s of course possible to do some dirty tricks to publish or access a partially-initialized
   object, despite needing to work very hard at it. There is, however, a nice surprise
   awaiting us when we try. For example: 
&lt;/p&gt;
&lt;p&gt;
   &lt;code&gt; &lt;blockquote&gt; &lt;pre&gt;
type C() =
    abstract member virt : unit -&gt; unit
    default this.virt() = ()

type D() as this =
    inherit C()
    do this.virt()

type E =
    inherit D
    val x : int
    new() = { x = 42; }
    override this.virt() = printf "x: %d" this.x

let e = new E()
&lt;/pre&gt;
   &lt;/blockquote&gt; &lt;/code&gt; 
&lt;/p&gt;
&lt;p&gt;
   This example attempts to perform a virtual invocation from C before the more derived
   class has been fully initialized. This overridden virtual simply (attempts to) prints
   out the value of x. If we compile and run this program, however, we will notice that
   we get an exception: “InvalidOperationException: the initialization of an object or
   value resulted in an object or value being accessed recursively before it was fully
   initialized.” 
&lt;/p&gt;
&lt;p&gt;
   Pretty neat. The compiler will stick in the checks necessary when ‘this’ is being
   accessed, to dynamically verify that an object is not being leaked before having been
   fully initialized. The F# approach can be summed up as trying to make things airtight
   as possible statically at compile-time, but admitting that there are holes – primarily
   due to inheritance – and dealing with them by inserting dynamic runtime checks. This
   tradeoff clearly makes sense for F#, because it is attempting to attain a robust level
   of reliability around immutability. 
&lt;/p&gt;
&lt;p&gt;
   F# also adds non-nullability for the most part. Like Haskell’s Maybe monad, F# adds
   an option type that can store a single None value which lies outside of the underlying
   type’s domain to effectively represent null. Because F# is a .NET language it of course
   also needs to worry about nulls at interop boundaries with other languages like C#
   and VB. This is a great step forward; first class CLI support would be a nice next
   step. 
&lt;/p&gt;
&lt;p&gt;
   A slight variant of the F# idea is to initialize data up the whole class hierarchy
   in one pass, and then run ad-hoc construction code in a second pass in the usual way.
   So long as readonly data can be initialized without running the ad-hoc construction
   code, this helps to statistically cut down on the chances for accidental leaking of
   ‘this’. 
&lt;/p&gt;
&lt;p&gt;
&lt;h2&gt;Type system: T versus notconstructed-T
&lt;/h2&gt;
&gt;
&lt;p&gt;
   We can model two kinds of T in the type-system: T and notconstructed-T. The constructor
   for any type T would then see the ‘this’ pointer as an notconstructed-T, and everything
   else – by default – sees ordinary T’s. 
&lt;/p&gt;
&lt;p&gt;
   What good does this distinction do? It enables us to add verification and restrictions
   around the use of notconstructed-T’s and limitations to the conversion between the
   two types. See &lt;a href="http://portal.acm.org/citation.cfm?id=949305.949332"&gt;this
   paper by Manuel Fahndrich and Rustan Leino&lt;/a&gt; for an example of how this approach
   was taken in Spec#’s non-nullability work. 
&lt;/p&gt;
&lt;p&gt;
   For example, we can prohibit conversion between T and notconstructed-T altogether,
   thereby disallowing escaping ‘this’ references altogether. If the type of ‘this’ within
   a constructor is different than all other references to type T, and they are not convertible,
   we’ve successfully walled the problem off in the type system. This protects against
   erroneous method calls, so that a constructor could not call any of its own methods,
   because these methods expect an ordinary T whereas the constructor only has a notconstructed-T.
   And because you cannot state notconstructed-T in the language, you cannot let one
   leak by storing it into a field. 
&lt;/p&gt;
&lt;p&gt;
   We could add more sophisticated support, by allowing a programmer to explicitly tag
   certain non-field references as T-notconstructed. This makes the concept first-class
   in the language, and allows one to explicitly declare the fact that code is interacting
   with a partially-constructed T: 
&lt;/p&gt;
&lt;p&gt;
   &lt;code&gt; &lt;blockquote&gt; &lt;pre&gt;
class C
{
    int m_x;
    public C() {
        m_x = V();
    }
    protected virtual int V() notconstructed {
        … I know to be careful …
    }
}
&lt;/pre&gt;
   &lt;/blockquote&gt; &lt;/code&gt; 
&lt;/p&gt;
&lt;p&gt;
   In this example, the programmer has annotated V with ‘notconstructed’. This enables
   the call from the constructor because the method’s ‘this’ is an uninitialized-T, and
   serves as a warning to the programmer that he or she should take care, much like the
   code written inside a constructor. 
&lt;/p&gt;
&lt;p&gt;
   We must also decide whether fields are offlimits via notconstructed-T’s. If yes, we
   can add F#-style dynamic checks for initialization, but only for attempted accesses
   against notconstructed object’s fields. This is nice because it means the scope of
   dynamic checks are limited, and used in a pay-for-play manner. And we could even enable
   a programmer to sidestep the error by stating at the use-site that they understand
   a particular field access may be to uninitialized memory, like Field.ReadMaybeUninitialized(&amp;m_field). 
&lt;/p&gt;
&lt;p&gt;
   To be honest, the reason this approach has likely not yet seen widespread use is that
   the cost is not commensurate with the benefit. At least, in my opinion. To make something
   like partially-constructed objects a first-class concept in a programming language,
   programmers would need to want to know where they are dealing with them. For systems
   programmers, this makes sense. For many other programmers, it would be useless ceremony
   with no perceived value. And yet the initial approach where nothing new needed to
   be stated, but yet escaping ‘this’ was prevented, blocks certain patterns of legal
   use. This is where theory and practice run up against one another. There is, however,
   presumably a nice middle ground awaiting discovery. 
&lt;/p&gt;
&lt;p&gt;
&lt;h1&gt;Winding down
&lt;/h1&gt;
&gt;
&lt;p&gt;
   I meant for this to be a short post. But the topic really is fascinating, and has
   been coming up time and time again as we do language work here at Microsoft. But it
   is truly fascinating mainly because, like nulls, the problem is widespread yet tolerable,
   and most C++ and C# programmers learn the rules and make do. Partially-constructed
   objects are a major blemish, but not a crisis that threatens the complete abandonment
   of imperative programming. 
&lt;/p&gt;
&lt;p&gt;
   I do believe language trends indicate that more will move away from C++-style object
   initialization and related issues, and more towards immutability and treating data
   and its initialization separately from imperative ad-hoc initialization code. Haskell,
   F#, and Clojure, for example, show us some promising paths forward. There are a plethora
   of other attempts at solving related problems, and I unfortunately could only scratch
   the surface. 
&lt;/p&gt;
&lt;p&gt;
   Although these techniques are not new, the primary question – to me – is how close
   to “the metal” in systems programming these concepts can be made to work. Typically
   for those situations, you need to rely on a mixture of static verification and complete
   freedom, because the dynamic checking is too costly and the code to work around overly-limiting
   static verification also adds too much cost. But as soon as you add complete freedom
   into the picture, you run into partially-constructed objects as a consequence, and
   all the issues I’ve mentioned above. 
&lt;/p&gt;
&lt;p&gt;
   Anyway, I hope it was interesting. 
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=1f0b032e-5871-4749-9c74-71fe205c9bc5" /&gt;</description>
      <category>Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=94778838-3e27-4788-a935-006ae9c8ef94</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,94778838-3e27-4788-a935-006ae9c8ef94.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
          <a href="http://www.bluebytesoftware.com/blog/2010/01/03/ABriefRetrospectiveOnTransactionalMemory.aspx">My
      article about Transactional Memory</a> (TM) was picked up by a few news feeds recently.
   </p>
        <p>
      If I had known this would occur, I would have written it with greater precision. 
      Because my article presents a mixture of technical challenges interspersed among more
      subjective and cultural issues, I am sure it is difficult to tease out my intended
      conclusion.  To summarize, I merely believe adding TM to a shared memory
      architecture alone is insufficient.
   </p>
        <p>
      Indeed, I remain a big fan of transactions.  Atomicity, consistency, and isolation,
      and coming up with strategies for achieving all three in tandem, are part and parcel
      to architecting software.
   </p>
        <p>
      After <a href="http://www.infoq.com/presentations/liskov-power-of-abstraction">watching
      Barbara Liskov's OOPSLA Turing Award reprise</a>, I decided to reacquaint myself with <a href="http://portal.acm.org/citation.cfm?id=42399">some
      old Argus papers</a> this weekend.  It has been some time since I last read them. 
      Argus was Liskov's language for distributed programming and her follow-on to CLU. 
      As with most research done by brilliant people, the work was way ahead of its time,
      has appeared in ad-hoc incarnations and permutations over time, and remains relevant
      today.  This research is particularly interesting to work that my team is doing
      right now, especially its notion of guardians.  And it is relevant to the TM
      discussion too.
   </p>
        <p>
      The Argus approach of using isolation to coarsely partition state and operations into
      independent bubbles, and then communicating asynchronously between the so-called
      guardians that are responsible for this state, is an architecture that is common
      among most highly concurrent programs.  This aids state management and fault
      tolerance.  Argus makes an interesting observation that, although guardians may
      be sent messages concurrently -- and indeed activities themselves may even introduce
      local concurrency -- manipulation of state can be done safely and even in parallel
      thanks to transactions.
   </p>
        <p>
      The requirement is that messages are atomic and commute.  Transactions, it turns
      out, are a convenient way of implementing this requirement.
   </p>
        <p>
      You will observe a similar architecture in other places, including in some languages
      that have adopted TM.  Haskell has moved in this direction.  Everything
      is purely functional and so, of course, no state is mutated in an unsafe way by default. 
      However, with the introduction of concucrrency comes mutable cells for message passing
      and with parallelism comes indeterminism.  You can push the state management
      problem up indefinitely, but at the top there are almost always mutable operations
      on real-world state (even if it is "just I/O").  Haskell programs have a safe
      architecture to begin with, and it is the intentional and careful addition of specific
      facilities that forces one to focus on the problematic seams.  One could say
      that Haskell starts clean and stays clean, versus most shared memory-based languages
      which start dirty and try to attain cleanliness (at least when it comes to concurrency).
   </p>
        <p>
      Why aren't transactions sufficient, then, given that the I in ACID stands for Isolation? 
      You wouldn't model a database as one flat table in which each row is a single byte,
      however, would you?  As you begin to decompose your program into isolated state,
      your bubbles (or guardians) are the tables, and your objects are the rows.  This
      is just an analogy but I find it useful to think in these terms.  Taking a bunch
      of intermingled state and pouring transactions on top is not going to give you this
      nicely partitioned separation of state which has proven to be the lifeblood of concurrency.
   </p>
        <p>
      Even once you've attained a more isolated architecture, however, transactions are
      not a panacea.  They are just one of many viable state management techniques
      in a programmer's arsenal, hierarchical state machines being another notable example. 
      And in fact, many of the problems I mentioned in the TM article are still worrisome
      even when you start from the right place.  From within a guardian, you may wish
      to enlist the aid of another unrelated guardian to perform a coordinated atomic activity,
      because a higher level invariant relationship between them must be preserved. 
      Or an application which composes multiple guardians may wish to do so atomically. 
      Even Argus required manual compensation for such things.  This can be solved
      in part by DTC.  But experience suggests that continuing to push the enlistment
      scope one level higher eventually leads to substantial problems.  A topic for
      another day, I suppose.
   </p>
        <p>
      My primary conclusion is that TM is a great complement to highly concurrent programs,
      but only so long as you start from the right place.  The Argus and Haskell approaches
      are conducive to large-scale concurrency, but it is primarily because of the natural
      isolation those models provide; the addition of transactions address problems that
      remain after taking that step.  But without that first step, they would have
      gotten nowhere.
   </p>
        <img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=94778838-3e27-4788-a935-006ae9c8ef94" />
      </body>
      <title>More thoughts on transactional memory</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,94778838-3e27-4788-a935-006ae9c8ef94.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2010/05/17/MoreThoughtsOnTransactionalMemory.aspx</link>
      <pubDate>Mon, 17 May 2010 04:48:58 GMT</pubDate>
      <description>&lt;p&gt;
   &lt;a href="http://www.bluebytesoftware.com/blog/2010/01/03/ABriefRetrospectiveOnTransactionalMemory.aspx"&gt;My
   article about Transactional Memory&lt;/a&gt; (TM) was picked up by a few news feeds recently.
&lt;/p&gt;
&lt;p&gt;
   If I had known this would occur, I would have written it with greater precision.&amp;nbsp;
   Because my article presents a mixture of technical challenges interspersed among more
   subjective and cultural issues, I am sure it is difficult to tease out my intended
   conclusion.&amp;nbsp; To summarize,&amp;nbsp;I merely believe adding TM to a shared memory
   architecture alone is insufficient.
&lt;/p&gt;
&lt;p&gt;
   Indeed, I remain a big fan of transactions.&amp;nbsp; Atomicity, consistency, and isolation,
   and coming up with strategies for achieving all three in tandem, are part and parcel
   to architecting software.
&lt;/p&gt;
&lt;p&gt;
   After &lt;a href="http://www.infoq.com/presentations/liskov-power-of-abstraction"&gt;watching
   Barbara Liskov's OOPSLA Turing Award reprise&lt;/a&gt;, I decided to reacquaint myself with &lt;a href="http://portal.acm.org/citation.cfm?id=42399"&gt;some
   old Argus papers&lt;/a&gt; this weekend.&amp;nbsp; It has been some time since I last read them.&amp;nbsp;
   Argus was Liskov's language for distributed programming and her follow-on to CLU.&amp;nbsp;
   As with most research done by brilliant people, the work was way ahead of its time,
   has appeared in ad-hoc incarnations and permutations over time, and remains relevant
   today.&amp;nbsp; This research is particularly interesting to work that my team is doing
   right now, especially its notion of guardians.&amp;nbsp; And it is relevant to the TM
   discussion too.
&lt;/p&gt;
&lt;p&gt;
   The Argus approach of using isolation to coarsely partition state and operations into
   independent bubbles, and then communicating asynchronously between&amp;nbsp;the so-called
   guardians&amp;nbsp;that are responsible for this state, is an architecture that is common
   among most highly concurrent programs.&amp;nbsp; This aids state management and fault
   tolerance.&amp;nbsp; Argus makes an interesting observation that, although guardians may
   be sent messages concurrently -- and indeed activities themselves may even introduce
   local concurrency -- manipulation of state can be done safely and even in parallel
   thanks to transactions.
&lt;/p&gt;
&lt;p&gt;
   The requirement is that messages are atomic and commute.&amp;nbsp; Transactions, it turns
   out,&amp;nbsp;are a convenient way&amp;nbsp;of implementing this requirement.
&lt;/p&gt;
&lt;p&gt;
   You will observe a similar architecture in other places, including in some languages
   that have adopted TM.&amp;nbsp; Haskell has moved in this direction.&amp;nbsp; Everything
   is purely functional and so, of course, no state is mutated in an unsafe way by default.&amp;nbsp;
   However, with the introduction of concucrrency comes mutable cells for message passing
   and with parallelism comes indeterminism.&amp;nbsp; You can push the state management
   problem up indefinitely, but at the top there are almost always mutable operations
   on real-world state (even if it is "just I/O").&amp;nbsp; Haskell programs have a safe
   architecture to begin with, and it is the intentional and careful addition of specific
   facilities that forces one to focus on the problematic seams.&amp;nbsp; One could say
   that Haskell starts clean and stays clean, versus most shared memory-based languages
   which start dirty and try to attain cleanliness (at least when it comes to concurrency).
&lt;/p&gt;
&lt;p&gt;
   Why aren't transactions sufficient, then, given that the I in ACID stands for Isolation?&amp;nbsp;
   You wouldn't model a database as one flat table in which each row is a single byte,
   however, would you?&amp;nbsp; As you begin to decompose your program into isolated state,
   your bubbles (or guardians) are the tables, and your objects are the rows.&amp;nbsp; This
   is just an analogy but I find it useful to think in these terms.&amp;nbsp; Taking a bunch
   of intermingled state and pouring transactions on top is not going to give you this
   nicely partitioned separation of state which has proven to be&amp;nbsp;the lifeblood of&amp;nbsp;concurrency.
&lt;/p&gt;
&lt;p&gt;
   Even once you've attained a more isolated architecture, however, transactions are
   not a panacea.&amp;nbsp; They are just one of many viable state management techniques
   in a programmer's arsenal, hierarchical state machines being another notable example.&amp;nbsp;
   And in fact, many of the problems I mentioned in the TM article are still worrisome
   even when you start from the right place.&amp;nbsp; From within a guardian, you may wish
   to enlist the aid of another unrelated guardian to perform a coordinated atomic activity,
   because a higher level invariant relationship between them must be preserved.&amp;nbsp;
   Or an application which composes multiple guardians may wish to do so atomically.&amp;nbsp;
   Even Argus required manual compensation for such things.&amp;nbsp; This can be solved
   in part by DTC.&amp;nbsp; But experience suggests that continuing to push the enlistment
   scope one level higher eventually leads to substantial problems.&amp;nbsp; A topic for
   another day, I suppose.
&lt;/p&gt;
&lt;p&gt;
   My primary conclusion is that TM is a great complement to highly concurrent programs,
   but only so long as you start from the right place.&amp;nbsp; The Argus and Haskell approaches
   are conducive to large-scale concurrency, but it is primarily because of the natural
   isolation those models provide; the addition of transactions address problems that
   remain after taking that step.&amp;nbsp; But without that first step, they would have
   gotten nowhere.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=94778838-3e27-4788-a935-006ae9c8ef94" /&gt;</description>
      <category>Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=ae76cfe5-e205-437a-a2e0-cd91f8b13654</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,ae76cfe5-e205-437a-a2e0-cd91f8b13654.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
      We use static analysis very heavily in my project here at Microsoft, as a way of finding
      bugs and/or enforcing policies that would have otherwise gone unenforced.  Many
      of the analyses we rely on are in fact minor extensions to the CLR type system, and
      verge on “effect typing”, an intriguing branch of type systems research that has matured
      significantly over the years. 
   </p>
        <p>
      Many of these annotations are done on methods, rather than types.  A few examples
      include:
   </p>
        <ol>
          <li>
         [MayBlock] indicates that a method is free to call methods that might block. 
      </li>
          <li>
         [NoAllocations] indicates that a method is neither allowed to allocate, nor call another
         method that might allocate 
      </li>
          <li>
         [Throws(...)] indicates that a method is allowed to throw an exception of a type in
         the set { … }, or call other methods that may throw exceptions in the set { … }. 
      </li>
          <li>
         And so on.</li>
        </ol>
        <p>
      It turns out there’s a general way for handling these annotations.  And indeed,
      you will quickly find that pursuing ad-hoc solutions to each independently leads to
      troubles.  We shall briefly look at the generalization.
   </p>
        <p>
      We must first observe that each falls into one of two categories: additive or subtractive. 
      MayBlock and Throws are additive.  They say what is permitted.  NoAllocations,
      on the other hand, is subtractive, because the annotation declares what is not permitted. 
      This distinction, we shall see, is crucial.
   </p>
        <p>
      First we can imagine that each distinct effect shown above has a distinct effect type.
   </p>
        <p>
      The types EMayBlock, ENoAllocations, and EThrows correspond to the annotations above. 
      This will permit us to model effects using subtyping polymorphism.  We will use
      the usual notation, i.e. “S &lt;: T” means “S is a subtype of T”, or “a S is
      substitutable in place of a T”.  For example, String &lt;: Object.  Throws
      is special, because it has a type hierarchy of its own beneath the root type. 
      As you might guess, this hierarchy is infinite in size and is comprised of each possible
      permutation of exception types.
   </p>
        <p>
      There are two special kinds of effects: the null effect (ENil), and a set of other
      effects (EMany).  The latter permits us to create a new, unique effect type merely
      by concatenating a list of other effect types.
   </p>
        <p>
      Each method is then given an EMany effect type containing its full set of effect types. 
      For example:
   </p>
        <blockquote dir="ltr" style="MARGIN-RIGHT: 0px">
          <p>
            <code>[MayBlock, Throws(typeof(FileNotFoundException)), NoAllocations]<br />
      void M() { … }</code>
          </p>
        </blockquote>
        <p>
      Is given the distinct effect type EMany { EMayBlock, EThrows(typeof(FileNotFoundException)),
      ENoAllocations }.
   </p>
        <p>
      We should make one generalization before moving on.  ENil ~ EMany { }. 
      In other words, having no effects is equivalent to a list of no effects.  Furthermore,
      EMany { } ~ EMany { ENil }.  In other words, having a list of no effects is equivalent
      to having no effects.
   </p>
        <p>
      Now we are ready to weave everything together.  The main question confronting
      us is as follows: What is the subtyping relationship between the various effect types,
      including the null and list types?
   </p>
        <p>
      The easiest to do away with is the EMany type.  Given two EMany types E and F,
      then E &lt;: F if, for all effects T in E’s type set, there exists an effect type
      U in F’s type set such that T &lt;: U.  In simpler terms, a list is a subtype
      of another list so long as all of its components are also subtypes of a component
      of the other.  This is very abstract, but we shall see soon why it is useful.
   </p>
        <p>
      Now we get to see why the additive and subtractive distinctions are so important:
   </p>
        <blockquote dir="ltr" style="MARGIN-RIGHT: 0px">
          <p>
      Given an additive effect type EAdditive, we say ENil &lt;: EAdditive.
   </p>
          <p>
      Given a subtractive effect type ESubtractive, we say ESubtractive &lt;: ENil.
   </p>
        </blockquote>
        <p>
      The first statement says that a method with no effects is substitutable for a method
      with additive effects, and the second says that a method with subtractive effects
      is substitutable for a method with no effects.  The corollaries are perhaps just
      as important.  A method with additive effects cannot take the place of a method
      with no effects, whereas a method with subtractive effects can.
   </p>
        <p>
      For the simple single-effect case, effects depicted in this way represent points on
      a line, where ENil is zero, subtractive effects are negative integers, and additive
      effects are positive integers.  The lattice obviously becomes rather complicated
      as many effects accumulate.
   </p>
        <p>
      Where does substitutability come up with respect to methods, anyway, you may ask? 
      The first is in determining which other methods can be called.  If a method
      M with effects E is trying to call another method N with effects F, this is permitted
      so long as F &lt;: E.  The next is in virtuals and overriding.  A virtual
      with effects E may be overridden by a method with effects F so long as F &lt;: E. 
      The following example illustrates this idea, in addition to the composition of the
      subtyping rules we have shown so far:
   </p>
        <blockquote dir="ltr" style="MARGIN-RIGHT: 0px">
          <code>
            <p>
      class C<br />
      {<br />
          public virtual void M() {}<br />
          [MayBlock, NoAllocations]<br />
          public virtual void N() {}<br />
      }
   </p>
            <p>
      class D : C<br />
      {<br />
          [MayBlock, NoAllocations]<br />
          public override void M() {}<br />
          public override void N() {}<br />
      }
   </p>
          </code>
        </blockquote>
        <p>
      In this example, the four methods are given the following effect types:
   </p>
        <ul>
          <li>
         C::M gets EMany { ENil }, or just ENil. 
      </li>
          <li>
         C::N gets EMany { MayBlock, NoAllocations }. 
      </li>
          <li>
         D::M gets EMany { MayBlock, NoAllocations }. 
      </li>
          <li>
         D::N gets EMany { ENil }, or just ENil.</li>
        </ul>
        <p>
      What does all this gibberish mean?  Well it’s straightforward and intuitive,
      actually.
   </p>
        <p>
      We are attempting to add the MayBlock and NoAllocations effects to the overridden
      M method which has none.  Because MayBlock is additive, this is illegal (someone
      might call C::M thinking the code will not block), whereas it is OK for NoAllocations
      (calls through D::M are assured no allocations will happen, even though calls through
      C::M are guaranteed no such thing).  Similarly, we are attempting to remove both
      effects from the overridden N method.  Because MayBlock is additive, this is
      OK (M isn’t required to block, even though calls through C::M may suspect it of doing
      so), whereas it is decidedly not OK for NoAllocations (calls through C::M will reasonable
      assume allocations do not happen, whereas D::M would be free to perform them). 
      It may take some thought to convince yourself that this is correct, but I hope that
      you find that it is.  All of this works because of the subtyping of effect types.
   </p>
        <p>
      All of this works similarly with delegates.  The source delegate signature is
      akin to the base class in the above example, whereas the target method being bound
      to is like the override.
   </p>
        <p>
      Things get a little more complicated when considering the EThrows effect.  It
      is additive, so it is of course true that ENil &lt;: EThrows(*).  However, what
      if we have two different EThrows, and wish to inquire about substitutability of one
      in place of the other?  We can come up with a simple rule that is general purpose
      for all set-of-type kinds of effects.  Namely, consider two instances A and B
      of the same effect type:
   </p>
        <blockquote dir="ltr" style="MARGIN-RIGHT: 0px">
          <p>
      Given an additive effect type EAdditive, then A &lt;: B if, for all types T in A’s
      set-of-types, there exists a type U in B’s set-of-types such that T &lt;: U.
   </p>
          <p>
      Given a subtractive effect type ESubtractive, then A &lt;: B if, for all types T in
      A’s set-of-types, there exists a type U in B’s set-of-types such that U &lt;:
      T.
   </p>
        </blockquote>
        <p>
      These sound quite similar, except that they end differently (i.e. T &lt;: U vs. U
      &lt;: T).  We may illustrate the additive case with EThrows; to illustrate the
      subtractive case, let us imagine we can declare a ENoAllocations effect type that
      specifies which precise types may not be allocated:
   </p>
        <blockquote dir="ltr" style="MARGIN-RIGHT: 0px">
          <code>
            <p>
      class A{}<br />
      class B : A{}
   </p>
            <p>
      class C<br />
      {<br />
          [Throws(typeof(Exception)), NoAllocations(typeof(A))]<br />
          public virtual void M() {}<br />
          [Throws(typeof(FileNotFoundException)), NoAllocations(typeof(B))]<br />
          public virtual void N() {}<br />
      }
   </p>
            <p>
      class D : C<br />
      {<br />
          [Throws(typeof(FileNotFoundException)), NoAllocations(typeof(B))]<br />
          public override void M() {}<br />
          [Throws(typeof(Exception)), NoAllocations(typeof(A))]<br />
          public override void N() {}<br />
      }
   </p>
          </code>
        </blockquote>
        <p>
      The results should not be surprising.  D::M overrides C::M’s exception list,
      by being more specific and declaring that FileNotFoundException is thrown instead
      of just Exception.  This is OK.  Whereas D::N overrides C::N’s list by being
      more general purpose, specifying Exception instead of FileNotFoundException. 
      This is clearly not OK.  The NoAllocations type works in exactly the reverse. 
      D::M attempts to prohibit allocations of B, but this is merely one possible subtype
      of the base method C::M’s declaration of A, and therefore this is illegal.  Whereas
      D::N ensures no instances of A are allocated, which of course subsumes the base method
      C::N’s declaration that no B’s are allocated.
   </p>
        <p>
      Everything gets a little more interesting when you consider generics.  For example,
      how would we type a general purpose Map method?  (This pattern arises quite frequently.) 
      We would presumably want it to somehow “acquire” the effects of the delegate it invokes
      on all elements in a list.  For example:
   </p>
        <blockquote dir="ltr" style="MARGIN-RIGHT: 0px">
          <code>
            <p>
      U[] Map&lt;T, U&gt;(T[] input, Func&lt;T, U&gt; func);
   </p>
          </code>
        </blockquote>
        <p>
      This declaration is stronger than necessary.  The Func&lt;T, U&gt; class – prepackaged
      with the .NET Framework – does not have any effects on it.  So it may not, for
      example, bind to a method that has any additive effects like Throws on it.  This
      is rather unfortunate.
   </p>
        <p>
      To solve this we could imagine treating effects with parametric polymorphism:
   </p>
        <blockquote dir="ltr" style="MARGIN-RIGHT: 0px">
          <code>
            <p>
      [Effects(E)]<br />
      U Func&lt;T, U, [EffectParameter] E&gt;(T x);
   </p>
          </code>
        </blockquote>
        <p>
      This fictitious syntax merely says that Func can be instantiated with an effect type
      E, and that the Func “method” itself acquires the effect E.  (Admittedly I should
      stop using faux-attribute syntax for illustrations since we’ve reached this level
      of language integration.)  Now Map can be declared as such:
   </p>
        <blockquote dir="ltr" style="MARGIN-RIGHT: 0px">
          <code>
            <p>
      [Effects(E)]<br />
      U[] Map&lt;T, U, [EffectParameter] E&gt;(T[] input, Func&lt;T, U, E&gt; func);
   </p>
          </code>
        </blockquote>
        <p>
      This says that Map has the same effects as the Func that is supplied as an argument. 
      It turns out that we may want to extend this further, by enabling symbolic manipulation
      of effects.  We may wish, for example, to specify that the Func is not allowed
      to block, by stating it does not have [MayBlock] in it.  You could imagine using
      something very similar to generic constraints to achieve this.  It is also interesting
      to allow concatenation of multiple effect types, both through partial and full specialization. 
      For example, Map above may clearly have effects of its own.  You also tend to
      want generic constraints like, 'where E : F', which of course just depends on the
      aforementioned subtyping rules.  And of course C# 4.0's co- and contravariance
      can be applied to effects too.
   </p>
        <p>
      Anyway, I have probably gone beyond most readers’ interest level in this subject. 
      Things sure do get very interesting when you allow symbolic manipulation of effects. 
      They get even more interesting when you begin to think of types as having “permissible
      effects” attached to them.  However, the main thing I wanted to point out with
      this brief article is that this pattern arises quite frequently.  And despite
      everyone struggling through what seem to be odd corner cases as they develop ad-hoc
      solutions, there really is a sound generalization behind it all.  Many languages
      have first class effect typing, and I have found it liberating to think of many of
      these type system annotations through that lens.  Perhaps you shall too.
   </p>
        <img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=ae76cfe5-e205-437a-a2e0-cd91f8b13654" />
      </body>
      <title>From Simple Annotations for Analysis to Effect Typing</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,ae76cfe5-e205-437a-a2e0-cd91f8b13654.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2010/04/25/FromSimpleAnnotationsForAnalysisToEffectTyping.aspx</link>
      <pubDate>Sun, 25 Apr 2010 16:35:15 GMT</pubDate>
      <description>&lt;p&gt;
   We use static analysis very heavily in my project here at Microsoft, as a way of finding
   bugs and/or enforcing policies that would have otherwise gone unenforced.&amp;nbsp; Many
   of the analyses we rely on are in fact minor extensions to the CLR type system, and
   verge on “effect typing”, an intriguing branch of type systems research that has matured
   significantly over the years. 
&lt;/p&gt;
&lt;p&gt;
   Many of these annotations are done on methods, rather than types.&amp;nbsp; A few examples
   include:
&lt;/p&gt;
&lt;ol&gt;
   &lt;li&gt;
      [MayBlock] indicates that a method is free to call methods that might block. 
   &lt;li&gt;
      [NoAllocations] indicates that a method is neither allowed to allocate, nor call another
      method that might allocate 
   &lt;li&gt;
      [Throws(...)] indicates that a method is allowed to throw an exception of a type in
      the set { … }, or call other methods that may throw exceptions in the set { … }. 
   &lt;li&gt;
      And so on.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;
   It turns out there’s a general way for handling these annotations.&amp;nbsp; And indeed,
   you will quickly find that pursuing ad-hoc solutions to each independently leads to
   troubles.&amp;nbsp; We shall briefly look at the generalization.
&lt;/p&gt;
&lt;p&gt;
   We must first observe that each falls into one of two categories: additive or subtractive.&amp;nbsp;
   MayBlock and Throws are additive.&amp;nbsp; They say what is permitted.&amp;nbsp; NoAllocations,
   on the other hand, is subtractive, because the annotation declares what is not permitted.&amp;nbsp;
   This distinction, we shall see, is crucial.
&lt;/p&gt;
&lt;p&gt;
   First we can imagine that each distinct effect shown above has a distinct effect type.
&lt;/p&gt;
&lt;p&gt;
   The types EMayBlock, ENoAllocations, and EThrows correspond to the annotations above.&amp;nbsp;
   This will permit us to model effects using subtyping polymorphism.&amp;nbsp; We will use
   the usual notation, i.e. “S &amp;lt;: T” means “S is a subtype of T”, or&amp;nbsp;“a S is
   substitutable in place of a T”.&amp;nbsp; For example, String &amp;lt;: Object.&amp;nbsp; Throws
   is special, because it has a type hierarchy of its own beneath the root type.&amp;nbsp;
   As you might guess, this hierarchy is infinite in size and is comprised of each possible
   permutation of exception types.
&lt;/p&gt;
&lt;p&gt;
   There are two special kinds of effects: the null effect (ENil), and a set of other
   effects (EMany).&amp;nbsp; The latter permits us to create a new, unique effect type merely
   by concatenating a list of other effect types.
&lt;/p&gt;
&lt;p&gt;
   Each method is then given an EMany effect type containing its full set of effect types.&amp;nbsp;
   For example:
&lt;/p&gt;
&lt;blockquote dir=ltr style="MARGIN-RIGHT: 0px"&gt; 
&lt;p&gt;
   &lt;code&gt;[MayBlock, Throws(typeof(FileNotFoundException)), NoAllocations]&lt;br&gt;
   void M() { … }&lt;/code&gt;
&lt;/p&gt;
&lt;/blockquote&gt; 
&lt;p&gt;
   Is given the distinct effect type EMany { EMayBlock, EThrows(typeof(FileNotFoundException)),
   ENoAllocations }.
&lt;/p&gt;
&lt;p&gt;
   We should make one generalization before moving on.&amp;nbsp; ENil ~ EMany { }.&amp;nbsp;
   In other words, having no effects is equivalent to a list of no effects.&amp;nbsp; Furthermore,
   EMany { } ~ EMany { ENil }.&amp;nbsp; In other words, having a list of no effects is equivalent
   to having no effects.
&lt;/p&gt;
&lt;p&gt;
   Now we are ready to weave everything together.&amp;nbsp; The main question confronting
   us is as follows: What is the subtyping relationship between the various effect types,
   including the null and list types?
&lt;/p&gt;
&lt;p&gt;
   The easiest to do away with is the EMany type.&amp;nbsp; Given two EMany types E and F,
   then E &amp;lt;: F if, for all effects T in E’s type set, there exists an effect type
   U in F’s type set such that T &amp;lt;: U.&amp;nbsp; In simpler terms, a list is a subtype
   of another list so long as all of its components are also subtypes of a component
   of the other.&amp;nbsp; This is very abstract, but we shall see soon why it is useful.
&lt;/p&gt;
&lt;p&gt;
   Now we get to see why the additive and subtractive distinctions are so important:
&lt;/p&gt;
&lt;blockquote dir=ltr style="MARGIN-RIGHT: 0px"&gt; 
&lt;p&gt;
   Given an additive effect type EAdditive, we say ENil &amp;lt;: EAdditive.
&lt;/p&gt;
&lt;p&gt;
   Given a subtractive effect type ESubtractive, we say ESubtractive &amp;lt;: ENil.
&lt;/p&gt;
&lt;/blockquote&gt; 
&lt;p&gt;
   The first statement says that a method with no effects is substitutable for a method
   with additive effects, and the second says that a method with subtractive effects
   is substitutable for a method with no effects.&amp;nbsp; The corollaries are perhaps just
   as important.&amp;nbsp; A method with additive effects cannot take the place of a method
   with no effects, whereas a method with subtractive effects can.
&lt;/p&gt;
&lt;p&gt;
   For the simple single-effect case, effects depicted in this way represent points on
   a line, where ENil is zero, subtractive effects are negative integers, and additive
   effects are positive integers.&amp;nbsp; The lattice obviously becomes rather complicated
   as many effects accumulate.
&lt;/p&gt;
&lt;p&gt;
   Where does substitutability come up with respect to methods, anyway, you may ask?&amp;nbsp;
   The first is in determining which other methods can be called.&amp;nbsp;&amp;nbsp;If a method
   M&amp;nbsp;with effects E is trying to call another method N with effects F, this is permitted
   so long as F &amp;lt;: E.&amp;nbsp; The next is in virtuals and overriding.&amp;nbsp; A virtual
   with effects E may be overridden by a method with effects F so long as F &amp;lt;: E.&amp;nbsp;
   The following example illustrates this idea, in addition to the composition of the
   subtyping rules we have shown so far:
&lt;/p&gt;
&lt;blockquote dir=ltr style="MARGIN-RIGHT: 0px"&gt;&lt;code&gt; 
&lt;p&gt;
   class C&lt;br&gt;
   {&lt;br&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp; public virtual void M() {}&lt;br&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp; [MayBlock, NoAllocations]&lt;br&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp; public virtual void N() {}&lt;br&gt;
   }
&lt;/p&gt;
&lt;p&gt;
   class D : C&lt;br&gt;
   {&lt;br&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp; [MayBlock, NoAllocations]&lt;br&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp; public override void M() {}&lt;br&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp; public override void N() {}&lt;br&gt;
   }
&lt;/p&gt;
&lt;/code&gt;&lt;/blockquote&gt; 
&lt;p&gt;
   In this example, the four methods are given the following effect types:
&lt;/p&gt;
&lt;ul&gt;
   &lt;li&gt;
      C::M gets EMany { ENil }, or just ENil. 
   &lt;li&gt;
      C::N gets EMany { MayBlock, NoAllocations }. 
   &lt;li&gt;
      D::M gets EMany { MayBlock, NoAllocations }. 
   &lt;li&gt;
      D::N gets EMany { ENil }, or just ENil.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;
   What does all this gibberish mean?&amp;nbsp; Well it’s straightforward and intuitive,
   actually.
&lt;/p&gt;
&lt;p&gt;
   We are attempting to add the MayBlock and NoAllocations effects to the overridden
   M method which has none.&amp;nbsp; Because MayBlock is additive, this is illegal (someone
   might call C::M thinking the code will not block), whereas it is OK for NoAllocations
   (calls through D::M are assured no allocations will happen, even though calls through
   C::M are guaranteed no such thing).&amp;nbsp; Similarly, we are attempting to remove both
   effects from the overridden N method.&amp;nbsp; Because MayBlock is additive, this is
   OK (M isn’t required to block, even though calls through C::M may suspect it of doing
   so), whereas it is decidedly not OK for NoAllocations (calls through C::M will reasonable
   assume allocations do not happen, whereas D::M would be free to perform them).&amp;nbsp;
   It may take some thought to convince yourself that this is correct, but I hope that
   you find that it is.&amp;nbsp; All of this works because of the subtyping of effect types.
&lt;/p&gt;
&lt;p&gt;
   All of this works similarly with delegates.&amp;nbsp; The source delegate signature is
   akin to the base class in the above example, whereas the target method being bound
   to is like the override.
&lt;/p&gt;
&lt;p&gt;
   Things get a little more complicated when considering the EThrows effect.&amp;nbsp; It
   is additive, so it is of course true that ENil &amp;lt;: EThrows(*).&amp;nbsp; However, what
   if we have two different EThrows, and wish to inquire about substitutability of one
   in place of the other?&amp;nbsp; We can come up with a simple rule that is general purpose
   for all set-of-type kinds of effects.&amp;nbsp; Namely, consider two instances A and B
   of the same effect type:
&lt;/p&gt;
&lt;blockquote dir=ltr style="MARGIN-RIGHT: 0px"&gt; 
&lt;p&gt;
   Given an additive effect type EAdditive, then A &amp;lt;: B if, for all types T in A’s
   set-of-types, there exists a type U in B’s set-of-types such that T &amp;lt;: U.
&lt;/p&gt;
&lt;p&gt;
   Given a subtractive effect type ESubtractive, then A &amp;lt;: B if, for all types T in
   A’s set-of-types, there exists a type U in B’s set-of-types such that&amp;nbsp;U &amp;lt;:
   T.
&lt;/p&gt;
&lt;/blockquote&gt; 
&lt;p&gt;
   These sound quite similar, except that they end differently (i.e.&amp;nbsp;T &amp;lt;: U vs.&amp;nbsp;U
   &amp;lt;: T).&amp;nbsp; We may illustrate the additive case with EThrows; to illustrate the
   subtractive case, let us imagine we can declare a ENoAllocations effect type that
   specifies which precise types may not be allocated:
&lt;/p&gt;
&lt;blockquote dir=ltr style="MARGIN-RIGHT: 0px"&gt;&lt;code&gt; 
&lt;p&gt;
   class A{}&lt;br&gt;
   class B : A{}
&lt;/p&gt;
&lt;p&gt;
   class C&lt;br&gt;
   {&lt;br&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp; [Throws(typeof(Exception)), NoAllocations(typeof(A))]&lt;br&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp; public virtual void M() {}&lt;br&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp; [Throws(typeof(FileNotFoundException)), NoAllocations(typeof(B))]&lt;br&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp; public virtual void N() {}&lt;br&gt;
   }
&lt;/p&gt;
&lt;p&gt;
   class D : C&lt;br&gt;
   {&lt;br&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp; [Throws(typeof(FileNotFoundException)), NoAllocations(typeof(B))]&lt;br&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp; public override void M() {}&lt;br&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp; [Throws(typeof(Exception)), NoAllocations(typeof(A))]&lt;br&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp; public override void N() {}&lt;br&gt;
   }
&lt;/p&gt;
&lt;/code&gt;&lt;/blockquote&gt; 
&lt;p&gt;
   The results should not be surprising.&amp;nbsp; D::M overrides C::M’s exception list,
   by being more specific and declaring that FileNotFoundException is thrown instead
   of just Exception.&amp;nbsp; This is OK.&amp;nbsp; Whereas D::N overrides C::N’s list by being
   more general purpose, specifying Exception instead of FileNotFoundException.&amp;nbsp;
   This is clearly not OK.&amp;nbsp; The NoAllocations type works in exactly the reverse.&amp;nbsp;
   D::M attempts to prohibit allocations of B, but this is merely one possible subtype
   of the base method C::M’s declaration of A, and therefore this is illegal.&amp;nbsp; Whereas
   D::N ensures no instances of A are allocated, which of course subsumes the base method
   C::N’s declaration that no B’s are allocated.
&lt;/p&gt;
&lt;p&gt;
   Everything gets a little more interesting when you consider generics.&amp;nbsp; For example,
   how would we type a general purpose Map method?&amp;nbsp; (This pattern arises quite frequently.)&amp;nbsp;
   We would presumably want it to somehow “acquire” the effects of the delegate it invokes
   on all elements in a list.&amp;nbsp; For example:
&lt;/p&gt;
&lt;blockquote dir=ltr style="MARGIN-RIGHT: 0px"&gt;&lt;code&gt; 
&lt;p&gt;
   U[] Map&amp;lt;T, U&amp;gt;(T[] input, Func&amp;lt;T, U&amp;gt; func);
&lt;/p&gt;
&lt;/code&gt;&lt;/blockquote&gt; 
&lt;p&gt;
   This declaration is stronger than necessary.&amp;nbsp; The Func&amp;lt;T, U&amp;gt; class – prepackaged
   with the .NET Framework – does not have any effects on it.&amp;nbsp; So it may not, for
   example, bind to a method that has any additive effects like Throws on it.&amp;nbsp; This
   is rather unfortunate.
&lt;/p&gt;
&lt;p&gt;
   To solve this we could imagine treating effects with parametric polymorphism:
&lt;/p&gt;
&lt;blockquote dir=ltr style="MARGIN-RIGHT: 0px"&gt;&lt;code&gt; 
&lt;p&gt;
   [Effects(E)]&lt;br&gt;
   U Func&amp;lt;T, U, [EffectParameter] E&amp;gt;(T x);
&lt;/p&gt;
&lt;/code&gt;&lt;/blockquote&gt; 
&lt;p&gt;
   This fictitious syntax merely says that Func can be instantiated with an effect type
   E, and that the Func “method” itself acquires the effect E.&amp;nbsp; (Admittedly I should
   stop using faux-attribute syntax for illustrations since we’ve reached this level
   of language integration.)&amp;nbsp; Now Map can be declared as such:
&lt;/p&gt;
&lt;blockquote dir=ltr style="MARGIN-RIGHT: 0px"&gt;&lt;code&gt; 
&lt;p&gt;
   [Effects(E)]&lt;br&gt;
   U[] Map&amp;lt;T, U, [EffectParameter] E&amp;gt;(T[] input, Func&amp;lt;T, U, E&amp;gt; func);
&lt;/p&gt;
&lt;/code&gt;&lt;/blockquote&gt; 
&lt;p&gt;
   This says that Map has the same effects as the Func that is supplied as an argument.&amp;nbsp;
   It turns out that we may want to extend this further, by enabling symbolic manipulation
   of effects.&amp;nbsp; We may wish, for example, to specify that the Func is not allowed
   to block, by stating it does not have [MayBlock] in it.&amp;nbsp; You could imagine using
   something very similar to generic constraints to achieve this.&amp;nbsp; It is also interesting
   to allow concatenation of multiple effect types, both through partial and full specialization.&amp;nbsp;
   For example, Map above may clearly have effects of its own.&amp;nbsp; You also tend to
   want generic constraints like, 'where E : F', which of course just depends on the
   aforementioned subtyping rules.&amp;nbsp; And of course C# 4.0's co- and contravariance
   can be applied to effects too.
&lt;/p&gt;
&lt;p&gt;
   Anyway, I have probably gone beyond most readers’ interest level in this subject.&amp;nbsp;
   Things sure do get very interesting when you allow symbolic manipulation of effects.&amp;nbsp;
   They get even more interesting when you begin to think of types as having “permissible
   effects” attached to them.&amp;nbsp; However, the main thing I wanted to point out with
   this brief article is that this pattern arises quite frequently.&amp;nbsp; And despite
   everyone struggling through what seem to be odd corner cases as they develop ad-hoc
   solutions, there really is a sound generalization behind it all.&amp;nbsp; Many languages
   have first class effect typing, and I have found it liberating to think of many of
   these type system annotations through that lens.&amp;nbsp; Perhaps you shall too.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=ae76cfe5-e205-437a-a2e0-cd91f8b13654" /&gt;</description>
      <category>Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=14b37ade-3110-4596-9d6e-bacdcd75baa8</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,14b37ade-3110-4596-9d6e-bacdcd75baa8.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <title>On (Haskell) Type Classes and (C#) Interfaces</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,14b37ade-3110-4596-9d6e-bacdcd75baa8.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2010/03/01/OnHaskellTypeClassesAndCInterfaces.aspx</link>
      <pubDate>Mon, 01 Mar 2010 04:52:59 GMT</pubDate>
      <description>&lt;p&gt;
   Simon Peyton Jones was in town a couple weeks back to deliver a repeat of his ECOOP’09
   keynote, &lt;a href="http://research.microsoft.com/en-us/um/people/simonpj/papers/haskell-retrospective/"&gt;“Classes,
   Jim, but not as we know them. Type classes in Haskell: what, why, and whither”&lt;/a&gt;,
   to a group of internal Microsoft language folks. It was a fantastic talk, and pulled
   together multiple strains of thought that I’ve been pondering lately, most notably
   the common thread amongst them.
&lt;/p&gt;
&lt;p&gt;
   In the talk, he compared polymorphism in Java-like languages (including C# which I
   will switch to referring to over Java hereforth) with ML and Haskell. In other words,
   how does a programmer commonly write code in each language that is maximally reusable?
   Of course, C# programmers primarily achieve this through subclassing, whereas functional
   programmers rely on type parameterization. Over the years, however, the former group
   has begun to borrow a great deal from the latter; as evidence, witness the growingly-pervasive
   use of generics in both Java and C# over the past decade. The talk was given mainly
   through the lens of this evolution, which appears to approach an interesting limit
   if projected far enough into the future.
&lt;/p&gt;
&lt;p&gt;
   Type classes came on the scene towards the end of the 1980’s, and immediately became
   a fertile seed for research and exploration in the relationship between subclass and
   parametric polymorphism. Type classes are much closer to subclass polymorphism than
   Haskell’s borrowed ML-style, which is to say parametric polymorphism. This is most
   intriguing because Haskell does not rely on subclassing, and so the mixture of two
   breeds new patterns.
&lt;/p&gt;
&lt;p&gt;
   I thought that it might be interesting to compare the mixture of subclass and parametric
   polymorphism in Haskell vis-à-vis type classes with the same in C# vis-à-vis a mixture
   of interfaces, generics, and generic constraints. Hence this post. We shall proceed
   by examining some basic type classes in Haskell with their equals in C#. Though similar,
   the dissimilarities are as stark as the similarities. And the lack of higher kinds
   -- particularly when combined with type classes -- means that some Haskell patterns
   simply are not expressible in C#.
&lt;/p&gt;
&lt;p&gt;
   &lt;strong&gt;The Simple Case: Equality (or Lack Thereof)&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;
   The most basic type class of all is Eq, which allows the comparison of two like-typed
   pieces of data. This may seem like a commodity if you ordinarily write code in languages
   like Java and C# which have a strong notion of object identity. In Haskell, however,
   equality is value equality over algebraic data types rather than objects, so polymorphism
   over equality operators is quite a bit more important. Indeed, as we shall see, Haskell’s
   approach is more powerful than == in Java-like languages. (Witness the neverending
   dichotomy between reference and value equality vis-à-vis Object.Equals in C#.) But
   alas, let us proceed by crawling in a series of logical steps, rather than leaping
   to the conclusions.
&lt;/p&gt;
&lt;p&gt;
   Haskell’s Eq type class is defined as such:
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; class Eq a where&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; (==), (/=) :: a -&gt; a -&gt; Bool&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;
   &lt;o:p&gt;
      &lt;font color=#0f243e&gt;&lt;/font&gt;
   &lt;/o:p&gt;
   &lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; x /= y = not (x == y)&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; x == y = not (x /= y)&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
&lt;p&gt;
   As you see, Eq provides two operators: == and /=. Default implementations of each
   define == as the inverse of /= and /= as the inverse of ==. Not only is this a convenience,
   but it also specifies the desired contract implementations ought to abide by. Other
   types may become members of the Eq class by mapping the one or both operators to type-specific
   functionality. You will immediately recognize the similarity to virtual methods in
   OOP languages, where the operators can be overridden by subclasses.
&lt;/p&gt;
&lt;p&gt;
   Of course all of the primitive data types already implement Eq, so you get value equality
   over numbers, strings, etc. Imagine we declared a new Coords type – comprised of two
   integers – and want to make it a member of Eq also – wherein equality is determined
   by a pairwise comparison of each’s members:
&lt;/p&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; data Coords = Coords { fst
   :: Integer, snd :: Integer }&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
   We make Coords a member of the Eq type class, and thereby define equality over instance,
   through the ‘instance Eq Coords where’ construct. This maps type class functions to
   real implementation functions. The example here defines them inline, though you may
   of course refer to existing functions instead:
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; instance Eq Coords where&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; (Coords fst1 snd1) == (Coords
   fst2 snd2) = (fst1 == fs2) &amp;&amp; (snd1 == snd2)&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
&lt;p&gt;
   Now we can take a ‘[Coords]’ and ask whether a particular ‘Coords’ exists within it.
&lt;/p&gt;
&lt;p&gt;
   A function may constrain a type variable to a certain class, and thereby access members
   of that class. For example, the following ‘isin’ function tests whether an instance
   of some type ‘a’ is contained within a list of type ‘[a]’. To do this, it demands
   that ‘a’ is a member of Eq using the syntax “Eq a =&gt;”:
&lt;/p&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; isin :: Eq a =&gt; a -&gt; [a]
   -&gt; Bool&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; x `isin` [] = False&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; x `isin` (y:ys) = x == y
   || (x `isin` ys)&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
&lt;p&gt;
   The moral equivalent to the Eq type class in C# is not so easy to decide. The most
   obvious first guess is 
   &lt;br&gt;
   the built-in == and != operators. However, we will quickly find that this is not quite
   right, because these operators are not polymorphic in C#. To illustrate this point,
   let’s try to write the ‘isin’ method in C#, using generics and the == operator, for
   example:
&lt;/p&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;bool
   IsIn&amp;lt;T&amp;gt;(T x, T[] ys)&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;{&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;foreach
   (T y in ys) {&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;if
   (x == y)&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;return
   true;&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;}&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;}&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
&lt;p&gt;
   This function will not compile. The reason is that == and != in C# are not defined
   over all types (specifically not for value types). You can get IsIn to compile by
   restricting the T to a reference type:
&lt;/p&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;bool
   IsIn&amp;lt;T&amp;gt;(T x, T[] ys) where T : class&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;{&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;…
   same as above …&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;}&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
&lt;p&gt;
   Although this code is deceptively similar to the Haskell example, it is actually quite
   different. The == used to compare two instances compiles into the MSIL CEQ operator,
   effectively hard-coding an object identity comparison. Even if an overloaded == operator
   for a particular instantiated T is available, the compiler will not bind to it. Why?
   Because it is overloading and specifically *not* overriding. For example, say we had
   a MyData type and an overloaded == operator for comparing two instances:
&lt;/p&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;class
   MyClass&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;{&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;public
   static bool operator ==(MyClass a, MyClass b) { return true; }&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;public
   static bool operator !=(MyClass a, MyClass b) { return false; }&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;}&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
&lt;p&gt;
   According to this, all MyClass objects are equal. However, the following call yields
   the answer ‘false’:
&lt;/p&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;IsIn&amp;lt;MyClass&amp;gt;(new
   MyClass(), new MyClass[] { new MyClass() });&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
   The same problem arises should instances of MyClass get referred to by Object references.
   == and != do not perform any kind of virtual dispatch; the selection of implementation
   is chosen statically.
&lt;/p&gt;
&lt;p&gt;
   Perhaps it is the Equals method inherited from System.Object, then? This, at least,
   is virtual. And indeed, this gets much closer to Eq. Any type may override Equals,
   and a generic definition defined in terms of it dispatches virtually and allows subclasses
   to change behavior on a type-by-type basis:
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;bool
   IsIn&amp;lt;T&amp;gt;(T x, T[] ys)&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;{&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;foreach
   (T y in ys) {&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;if
   (x == y || (x != null &amp;&amp; x.Equals(y)))&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;return
   true;&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;return
   false;&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;}&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
&lt;p&gt;
   (Even this is slightly different, because it assumes a certain type-agnostic behavior
   about nulls.)
&lt;/p&gt;
&lt;p&gt;
   This is cheating, however. We’ve taken advantage of the fact that someone thought
   to put an Equals method on System.Object, thereby giving all Ts such a method. There
   are clearly limits to how many crosscutting things can be added to System.Object before
   it becomes overwhelmed with concepts, not to mention the size (e.g. v-tables). Moreover,
   Equals on Object is weakly typed; a better solution is to use interfaces, like the
   IEquatable&amp;lt;T&amp;gt; interface that introduces a strongly typed Equals method:
&lt;/p&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;public
   interface IEquatable&amp;lt;T&amp;gt;&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;{&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;bool
   Equals(T other);&lt;br&gt;
   &lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;}&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
&lt;p&gt;
   And to use a generic type constraint on IsIn’s T, much more akin to what ‘isin’ in
   Haskell above did:
&lt;/p&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;bool
   IsIn&amp;lt;T&amp;gt;(T x, T[] ys)&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;where
   T : IEquatable&amp;lt;T&amp;gt;&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;{&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;foreach
   (T y in ys) {&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;if
   (x == y || (x != null &amp;&amp; x.Equals(y)))&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;return
   true;&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;}&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; return false;&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;}&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
&lt;p&gt;
   This is cheating a little less, because we can implement an interface after-the-fact
   without impacting a class’s type hierarchy. This, in fact, looks remarkably similar
   to the Haskell ‘isin’ shown earlier, using type classes and parametric polymorphism,
   where here we have used interfaces in place of type classes.
&lt;/p&gt;
&lt;p&gt;
   We might be tempted to define a default NotEquals method over all IEquatable&amp;lt;T&amp;gt;
   instances, just like Haskell does by implementing the defaults for == and /= as the
   inverse of each other:
&lt;/p&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;public
   static class Equatable&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;{&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;public
   static bool NotEquals&amp;lt;T&amp;gt;(this IEquatable&amp;lt;T&amp;gt; @this, IEquatable&amp;lt;T&amp;gt;
   other)&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;{&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;return
   !this.Equals(other);&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;}&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;}&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
&lt;p&gt;
   This is not perfect. It is not polymorphic; see my previous post for &lt;a href="http://www.bluebytesoftware.com/blog/2010/02/10/ExtensionMethodsAsDefaultInterfaceMethodImplementations.aspx"&gt;an
   extensive discussion&lt;/a&gt; of this and related points. And what about nulls? If '@this'
   is null, the default implementation is going to AV. We’d need to bake in type-agnostic
   knowledge of null again. Sigh!
&lt;/p&gt;
&lt;p&gt;
   Sadly, it turns out this whole approach in general isn’t quite right anyway. For two
   reasons:
&lt;/p&gt;
&lt;ul&gt;
   &lt;li&gt;
      First, we still infect the type in question with the interface being implemented;
      it cannot be done completely outside of the type’s definition, as with type classes.&lt;/li&gt;
   &lt;li&gt;
      Second, type classes in Haskell do not actually require a value of the type in question
      to dispatch against the class’s functions, whereas we clearly do in the above example:
      we need to virtually dispatch against the object, and rely on this virtual dispatch
      to execute different code for each type. This will come up as we look at the numeric
      classes, but it is a critical difference.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;
   A closer analogy is to use IEqualityComparer&amp;lt;T&amp;gt;:
&lt;/p&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; public interface IEqualityComparer&amp;lt;T&amp;gt;&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; {&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; bool Equals(T x, T y);&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; }&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
&lt;p&gt;
   (IEqualityComparer&amp;lt;T&amp;gt; in .NET also has a GetHashCode method on it. Let’s ignore
   that for now.)
&lt;/p&gt;
&lt;p&gt;
   Unfortunately, if our IsIn method were to use IEqualityComparer&amp;lt;T&amp;gt; to do its
   job, callers would be required to pass an instance explicitly; we cannot infer a “default”
   comparer based solely on the T:
&lt;/p&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;bool
   IsIn&amp;lt;T&amp;gt;(T x, T[] ys, IEqualityComparer&amp;lt;T&amp;gt; eq)&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;{&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;foreach
   (T y in ys) {&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;if
   (eq.Equals(x, y))&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;return
   true;&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;return
   false;&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;}&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
&lt;p&gt;
   Type classes actually function rather similarly, with two major differences:
&lt;/p&gt;
&lt;ol&gt;
   &lt;li&gt;
      The interface object – called a dictionary – is passed and used implicitly.&lt;/li&gt;
   &lt;li&gt;
      The mapping from types to dictionaries is done implicitly, whereas in .NET you’ll
      need to find an instance of the interface in question through other means.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;
   This second difference is solved by a little hack in .NET. If you take a look at the
   EqualityComparer&amp;lt;T&amp;gt;.Default property, you shall see a lot of slightly gross
   reflection code to return an instance of IEqualityComparer&amp;lt;T&amp;gt; for any arbitrary
   T. The code checks some well-known types and conditions, and ultimately falls back
   to the aforementioned interfaces and default Equals method for the most general case.
   It’s not pretty, but it’s a beautiful hack given the tools at our disposal in C#.
&lt;/p&gt;
&lt;p&gt;
   &lt;strong&gt;A Harder Case: Polymorphic Numbers, on Output Parameters&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;
   The Eq type class is easy. The functions it defines are polymorphic on their inputs,
   but not on their outputs; both == and /= return Bool values. Once we transition to
   polymorphic output parameters or return values, we encounter a pattern quite different
   from that which is found in most .NET interfaces.
&lt;/p&gt;
&lt;p&gt;
   Let’s illustrate these differences by looking at Haskell’s Num type class:
&lt;/p&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;class&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;(Eq
   a, Show a) =&gt; Num a&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;where&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;(+),
   (-), (*)&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;:: a -&gt; a -&gt; a&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;negate&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;::
   a -&gt; a&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;abs,
   signum&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;:: a -&gt; a&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;fromInteger&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;::
   Integer -&gt; a&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
&lt;p&gt;
   Here we see another feature of Haskell type classes: inheritance. Num derives from
   both Eq and Show – indicated by “(Eq a, Show a) =&gt; Num a” – the latter class of which
   we have not yet shown but is the moral equivalent to .NET’s Object.ToString method.
   It enables pretty printing of values, clearly something that would be expected to
   be common among all numeric data types. Haskell’s numeric class hierarchy is quite
   elegant, enabling highly polymorphic computations. A nice little tutorial of can be
   found here: &lt;a href="http://www.haskell.org/tutorial/numbers.html"&gt;http://www.haskell.org/tutorial/numbers.html&lt;/a&gt;.
&lt;/p&gt;
&lt;p&gt;
   But the question at hand is what the C# equivalent would be.
&lt;/p&gt;
&lt;p&gt;
   Our first approach would be to mimic the IEquatable&amp;lt;T&amp;gt; solution above:
&lt;/p&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;interface
   INumeric&amp;lt;T&amp;gt;&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;{&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; &lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt; T
   Add(T d);&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; &lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt; T
   Subtract(T d);&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; &lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt; T
   Multiply(T d);&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; &lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt; T
   Absolute();&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; &lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt; T
   FromInteger(int x);&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;}&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
&lt;p&gt;
   This works fine, and primitive types in .NET could presumably implement it:
&lt;/p&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;struct
   int : INumeric&lt;int&gt;
      &lt;INT&gt;
         { .. }
   &lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;struct
   float : INumeric&lt;float&gt;
      &lt;FLOAT&gt;
         { .. }
   &lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;struct
   double : INumeric&lt;double&gt;
      &lt;DOUBLE&gt;
         { .. }
   &lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;…&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
&lt;p&gt;
   This enables polymorphic code, like a Sum method, through the use of generic type
   constraints:
&lt;/p&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;public
   static T Sum&amp;lt;T&amp;gt;(params T[] values)&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;where
   T : INumeric&amp;lt;T&amp;gt;&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;{&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;T
   accum = default(T);&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;foreach
   (T v in values)&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;accum
   = v.Add(accum);&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;return
   accum;&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;}&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
&lt;p&gt;
   This example works great. Why then, you might wonder, doesn’t LINQ use this instead
   of providing special-case overloads of Average, Min, Max, Sum, etc. for all well-known
   primitive data types?
&lt;/p&gt;
&lt;p&gt;
   The primary reason is the performance hit taken to perform addition through O(N) interface
   calls versus O(N) MSIL ADD instructions. It is just a basic fact of life that today’s
   leading edge separate compilation techniques will not achieve parity with the hand
   specialized variants. While it is true that the JIT compiler *could* specialize the
   code for specific Ts and specific interfaces to emit more efficient instructions,
   like int, float, etc. over INumeric&amp;lt;T&amp;gt; calls, it will not do so today. This
   reduces the ability to share code – which admittedly is what we want here – and is
   tangled up in a judgment call based on heuristics. But I digress.
&lt;/p&gt;
&lt;p&gt;
   There is a larger problem that arises with other examples, at least from a language
   expressiveness point-of-view: the need to have an instance in hand to invoke interface
   methods. FromInteger, for example, is rather awkward to write. In fact, we cannot
   write a method with INumeric&amp;lt;T&amp;gt; like we could in Haskell:
&lt;/p&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;public
   static T MakeT&amp;lt;T&amp;gt;(int value)&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;where
   T : INumeric&amp;lt;T&amp;gt;&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;{&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;…
   ? …&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;}&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
&lt;p&gt;
   How do we invoke FromInteger, given that no T is available at the time of MakeT’s
   invocation? You can’t; you need to arrange for an instance to be available. There
   are ways out of this corner. One solution is to mandate that T has a default constructor:
&lt;/p&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;public
   static T MakeT&amp;lt;T&amp;gt;(int value)&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;where
   T : INumeric&amp;lt;T&amp;gt;, new()&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;{&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;return
   new T(value).FromInteger(value);&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;}&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
&lt;p&gt;
   That is always acceptable for structs, since they always have such a constructor;
   but this practice requires that classes be designed to possibly not hold invariants
   at all times, and so is not always acceptable or at the very least requires design
   accommodation.
&lt;/p&gt;
&lt;p&gt;
   The alternative is probably obvious. Use a similar approach to IEqualityComparer&amp;lt;T&amp;gt;:
&lt;/p&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;interface
   INumericProvider&amp;lt;T&amp;gt;&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;{&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; &lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt; T
   Add(T x, T y);&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; &lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt; T
   Subtract(T x, T y);&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; &lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt; T
   Multiply(T x, T y);&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; &lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt; T
   Absolute(T x);&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; &lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt; T
   FromInteger(int x);&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;}&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
&lt;p&gt;
   And now, of course, each method that does polymorphic number crunching must accept
   an instance of INumericProvider&amp;lt;T&amp;gt;. That’s particularly cumbersome, so it’s
   more likely that .NET developers would prefer the aforementioned approach, where the
   type must provide a default constructor.
&lt;/p&gt;
&lt;p&gt;
   Admittedly, I seldom run into this particular problem in practice; but when I do,
   I really wish I had something like Haskell type classes to help me out.
&lt;/p&gt;
&lt;p&gt;
   Before moving on, it is worth pointing out one Haskell type class problem that explicit
   interface object passing in .NET helps to avoid. Should you need multiple implementations
   of a given class for the same type, as is relatively common with equality comparisons,
   you must disambiguate in Haskell by separation by module and being careful about what
   you import. This is similar to C#’s extension methods. With explicitly passed interface
   objects, however, it is trivial to manage and pass separate objects if you’d like.
&lt;/p&gt;
&lt;p&gt;
   &lt;strong&gt;Close, but No Cigar: Higher Kinds&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;
   There is one last feature that Haskell provides – a pretty big one, I might add –
   that C# simply cannot do: higher kinded types, or polymorphism over constructed types.
   This feature is orthogonal to type classes, but gets used pervasively in conjunction
   with them. An example will make this stunningly clear:
&lt;/p&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;class&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;Monad
   m&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;where&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;(&gt;&gt;=)&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;::
   m a -&gt; (a -&gt; m b) -&gt; m b&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;(&gt;&gt;)&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;::
   m a -&gt; m b -&gt; m b&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;return&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;::
   a -&gt; m a&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;fail&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;::
   String -&gt; m a&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;
   &lt;o:p&gt;
      &lt;font color=#0f243e&gt;&lt;/font&gt;
   &lt;/o:p&gt;
   &lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;m
   &gt;&gt; k&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;=&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;m
   &gt;&gt;= \_ -&gt; k&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;fail
   s&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;= error s&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
   Let’s try to transcribe the core of this class in C#, renaming &gt;&gt;= to Bind, and omitting
   the &gt;&gt; and ‘fail s’ operators because they have default implementations:
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;public
   interface IMonad&amp;lt;M, A&amp;gt;&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;{&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;&lt;A, M&amp;lt;B&amp;gt;&gt;M&amp;lt;B&amp;gt;
   Bind&amp;lt;B&amp;gt;(M&amp;lt;A&amp;gt; m, Func k);&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;M&amp;lt;A&amp;gt;
   Return(A a);&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;M&amp;lt;A&amp;gt;
   Fail(string s);&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;}&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
&lt;p&gt;
   This approach is tantalizingly close. It suffers from the already-admitted problem
   that, for any M&amp;lt;A&amp;gt; instance, you will need to pass the appropriate IMonad&amp;lt;A&amp;gt;
   provider object – just as with the IEqualityComparer&amp;lt;T&amp;gt; and INumericProvider&amp;lt;T&amp;gt;
   examples above.
&lt;/p&gt;
&lt;p&gt;
   But the code of course won’t *actually* compile, because the type variable M cannot
   be constructed as shown here. We find references to M&amp;lt;A&amp;gt; and M&amp;lt;B&amp;gt;, which
   are complete nonsense to C#. M is just a plain type variable. M is required to be
   what Haskell calls a type constructor (* -&gt; *), which is a generic type that must
   be instantiated before it is a terminal type. &lt;a href="http://www.bluebytesoftware.com/blog/2008/11/04/LongingForHigherkindedC.aspx"&gt;I’ve
   written about this before&lt;/a&gt;. Although it seems like a trivial omission in C#’s language
   definition, it strikes at the heart of the type system.
&lt;/p&gt;
&lt;p&gt;
   A fictitious syntax for expressing this in C# might be:
&lt;/p&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;public
   interface IMonad&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;&lt;&gt;&lt;/FONT&gt;where
   M : 
   &lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;{&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt; ...&lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
&lt;p&gt;
   And if, say, M were expected to be a two- or three-parametered type, we would find,
   respectively:
&lt;/p&gt;
&lt;p&gt;
&lt;/p&gt;
&lt;pre&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; ... &lt;/span&gt;&lt;,&gt;where
   M : &lt;/font&gt;&lt;/span&gt; 
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-FAMILY: Consolas"&gt;&lt;font color=#0f243e&gt;&lt;span style="mso-spacerun: yes"&gt; ... &lt;/span&gt;&lt;,,&gt;where
   M : &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/pre&gt;
&lt;p&gt;
&lt;p&gt;
   And so on.
&lt;/p&gt;
&lt;p&gt;
   This could in theory work. But C# -- and more worrisome .NET and the CLR – do not
   support this presently, and, to be quite honest, likely never will. It is immensely
   powerful, however. Life without monads is a life destined to continuous repetition.
   The “LINQ Pattern”, for example, is one example case in .NET where, for each ‘source’
   type, we must create a “copy” of the original System.Linq.Enumerable variant. And
   shame on those who wish to write polymorphic code that will work for any LINQ provider.
&lt;/p&gt;
&lt;p&gt;
   &lt;strong&gt;Winding Down&lt;/strong&gt;
&lt;/p&gt;
&lt;p&gt;
   Let’s wind down. I need to go grab dinner at Mama's Fish House on Maui right now.
&lt;/p&gt;
&lt;p&gt;
   I hope to have shown some of the similarities and dissimilarities between type classes
   and interfaces, and some patterns that arise when these things are mixed with parametric
   polymorphism. The mix of inheritance for type classes, but not for implementation
   types, in Haskell is unique. C#, of course, allows inheritance both amongst interfaces
   and implementations which is both a blessing and a curse.
&lt;/p&gt;
&lt;p&gt;
   I do think both camps have something to teach one another. For example, having a default
   interface lookup mechanism for arbitrary types in C# would be wonderful, and indeed
   might provide a replacement for extension methods that has more longevity. I’m sure
   much of this will happen with time; either “in place” as the respective languages
   evolve, or as new languages are created with time. 
&lt;/p&gt;
&lt;p&gt;
   But most importantly, I hope that the blog post was educational and fun. Enjoy.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=14b37ade-3110-4596-9d6e-bacdcd75baa8" /&gt;</description>
      <category>Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=b9072376-beaf-4d17-a6c4-4bfccfbd34b0</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,b9072376-beaf-4d17-a6c4-4bfccfbd34b0.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <title>Extension methods as default interface method implementations</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,b9072376-beaf-4d17-a6c4-4bfccfbd34b0.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2010/02/10/ExtensionMethodsAsDefaultInterfaceMethodImplementations.aspx</link>
      <pubDate>Wed, 10 Feb 2010 02:21:49 GMT</pubDate>
      <description>&lt;p&gt;
   One of my comments in the 2nd edition of the .NET Framework design guidelines (on
   page 164) was that you can use extension methods as a way of getting default implementations
   for interface methods.&amp;nbsp; We've actually begun using these techniques here on my
   team.&amp;nbsp; To illustrate this trick, let's rewind the clock and imagine we were designing
   new collections APIs from day one.
&lt;/p&gt;
&lt;p&gt;
   Let's say we gave the core interfaces the most general methods possible.&amp;nbsp; These
   may neither be the most user friendly overloads nor the ones that most people use
   all the time.&amp;nbsp; They would, however, be those from which all the other convenience
   methods could be implemented.&amp;nbsp; An INewList&amp;lt;T&amp;gt; interface that was designed
   with these principles in mind may look like this:
&lt;/p&gt;
&lt;blockquote dir=ltr style="MARGIN-RIGHT: 0px"&gt; 
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /&gt;public
   interface INewList&amp;lt;T&amp;gt; : IEnumerable&amp;lt;T&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;{&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;int
   Count { get; }&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;T
   this[int index] { get; set; }&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;
   &lt;o:p&gt;
      &lt;font color=#000000&gt;&amp;nbsp;&lt;/font&gt;
   &lt;/o:p&gt;
   &lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;void
   InsertAt(int index, T item);&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;void
   RemoveAt(int index);&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/blockquote&gt; 
&lt;p&gt;
   This interface is missing all the nice convenience methods you will find on .NET's
   IList&amp;lt;T&amp;gt;, like Add, Clear, Contains, CopyTo, IndexOf, and Remove.&amp;nbsp; So it's
   not really as nice to use.&amp;nbsp; You can't write an API that takes in an INewList&amp;lt;T&amp;gt;
   and performs an Add against it, for example, like you can with IList&amp;lt;T&amp;gt;.
&lt;/p&gt;
&lt;p&gt;
   One approach to solving this might be to write a concrete class -- much like .NET's
   System.Collections.ObjectModel.Collection&amp;lt;T&amp;gt; -- that provides concrete implementations
   of all of these methods, and then other lists can simply subclass that.&amp;nbsp; But
   we can do better.
&lt;/p&gt;
&lt;p&gt;
   Instead, let's give INewList&amp;lt;T&amp;gt; default implementations of all of these methods.&amp;nbsp;
   How do we do this?&amp;nbsp; That's right: with extension methods.&amp;nbsp; Voila!
&lt;/p&gt;
&lt;blockquote dir=ltr style="MARGIN-RIGHT: 0px"&gt; 
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;public
   static class NewListExtensions&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;{&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;public
   static void Add&amp;lt;T&amp;gt;(this INewList&amp;lt;T&amp;gt; lst, T item)&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;{&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;lst.InsertAt(lst.Count,
   item);&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;
   &lt;o:p&gt;
      &lt;font color=#000000&gt;&amp;nbsp;&lt;/font&gt;
   &lt;/o:p&gt;
   &lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;public
   static void Clear&amp;lt;T&amp;gt;(this INewList&amp;lt;T&amp;gt; lst)&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;{&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;int
   count;&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;while
   ((count = lst.Count) &amp;gt; 0) {&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;lst.RemoveAt(count
   - 1);&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;
   &lt;o:p&gt;
      &lt;font color=#000000&gt;&amp;nbsp;&lt;/font&gt;
   &lt;/o:p&gt;
   &lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;public
   static bool Contains&amp;lt;T&amp;gt;(this INewList&amp;lt;T&amp;gt; lst, T item)&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;{&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;return
   lst.IndexOf(item) != -1;&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;
   &lt;o:p&gt;
      &lt;font color=#000000&gt;&amp;nbsp;&lt;/font&gt;
   &lt;/o:p&gt;
   &lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;public
   static void CopyTo&amp;lt;T&amp;gt;(this INewList&amp;lt;T&amp;gt; lst, T[] array, int arrayIndex)&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;{&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;for
   (int i = 0; i &amp;lt; lst.Count; i++) {&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;array[arrayIndex
   + i] = lst[i];&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;
   &lt;o:p&gt;
      &lt;font color=#000000&gt;&amp;nbsp;&lt;/font&gt;
   &lt;/o:p&gt;
   &lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;public
   static int IndexOf&amp;lt;T&amp;gt;(this INewList&amp;lt;T&amp;gt; lst, T item)&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;{&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;var
   eq = EqualityComparer&amp;lt;T&amp;gt;.Default;&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;for
   (int i = 0; i &amp;lt; lst.Count; i++) {&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;if
   (eq.Equals(item, lst[i])) {&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;return
   i;&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;return
   -1;&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;
   &lt;o:p&gt;
      &lt;font color=#000000&gt;&amp;nbsp;&lt;/font&gt;
   &lt;/o:p&gt;
   &lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;public
   static bool Remove&amp;lt;T&amp;gt;(this INewList&amp;lt;T&amp;gt; lst, T item)&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;{&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;int
   index = lst.IndexOf(item);&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;if
   (index == -1) {&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;return
   false;&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;
   &lt;o:p&gt;
      &lt;font color=#000000&gt;&amp;nbsp;&lt;/font&gt;
   &lt;/o:p&gt;
   &lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;lst.RemoveAt(index);&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;return
   true;&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/blockquote&gt; 
&lt;p&gt;
   Well isn't that neat.&amp;nbsp; We've now given any INewList&amp;lt;T&amp;gt; implementations
   all these common methods without dirtying their class hierarchies, built atop a tiny
   core of&amp;nbsp;extensibility.&amp;nbsp; This is much like .NET's Collection&amp;lt;T&amp;gt; which
   exposes the core as abstract methods.&amp;nbsp; Indeed, we can go even further.&amp;nbsp;
   Any convenience overloads, like the multitude of CopyTos on List&amp;lt;T&amp;gt; in .NET,
   can be given to all INewList&amp;lt;T&amp;gt;'s also.&amp;nbsp; And yet implementing INewList&amp;lt;T&amp;gt;
   remains as braindead simple as it was before: two properties and two methods.&amp;nbsp;
   In fact, it's simpler than doing a more feature-rich IList&amp;lt;T&amp;gt;, because the convenience
   methods come for free.
&lt;/p&gt;
&lt;p&gt;
   It would be even niftier if you could add these methods straight onto INewList&amp;lt;T&amp;gt;,
   and have the C# compiler emit the extension methods silently for you.&amp;nbsp; In other
   words:
&lt;/p&gt;
&lt;blockquote dir=ltr style="MARGIN-RIGHT: 0px"&gt; 
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;public
   interface INewList&amp;lt;T&amp;gt; : IEnumerable&amp;lt;T&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;{&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;...
   interface methods (as above) ...&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;
   &lt;o:p&gt;
      &lt;font color=#000000&gt;&amp;nbsp;&lt;/font&gt;
   &lt;/o:p&gt;
   &lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;void
   Add(T item)&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;{&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;InsertAt(Count,
   item);&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;
   &lt;o:p&gt;
      &lt;font color=#000000&gt;&amp;nbsp;&lt;/font&gt;
   &lt;/o:p&gt;
   &lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;void
   Clear()&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;{&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;int
   count;&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;while
   ((count = Count) &amp;gt; 0) {&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;RemoveAt(count
   - 1);&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;
   &lt;o:p&gt;
      &lt;font color=#000000&gt;&amp;nbsp;&lt;/font&gt;
   &lt;/o:p&gt;
   &lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;...
   and so on ...&lt;br&gt;
   &lt;/font&gt;&lt;/span&gt;&lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-fareast-font-family: Calibri; mso-fareast-theme-font: minor-latin; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: EN-US; mso-bidi-language: AR-SA"&gt;&lt;font color=#000000&gt;}&lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/blockquote&gt; 
&lt;p&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas; mso-bidi-font-family: 'Times New Roman'; mso-fareast-font-family: Calibri; mso-fareast-theme-font: minor-latin; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: EN-US; mso-bidi-language: AR-SA"&gt;&lt;/span&gt;Although
   this would just be sugar for the&amp;nbsp;NewListExtensions class shown earlier, it sure
   saves some typing and makes it the pattern more apparent and first class.
&lt;/p&gt;
&lt;p&gt;
   Though cool, this whole idea is certainly not perfect.
&lt;/p&gt;
&lt;p&gt;
   For one, there are no extension properties.&amp;nbsp; So you can't use this trick for
   properties.
&lt;/p&gt;
&lt;p&gt;
   But the more obvious and severe downside to this approach that these methods are not
   specialized for the given concrete type.&amp;nbsp; For example, the Clear method is potentially
   far less efficient than a hand-rolled List&amp;lt;T&amp;gt;, because it does O(N) RemoveAts
   rather than a single O(1) fixup of the count.
&lt;/p&gt;
&lt;p&gt;
   Recall now that the compiler binds more tightly to instance methods than extension
   methods.&amp;nbsp; So we could implement our own little list class with a faster Clear
   method if we'd like:
&lt;/p&gt;
&lt;blockquote dir=ltr style="MARGIN-RIGHT: 0px"&gt; 
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;class
   MyList : INewList&amp;lt;T&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;{&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;...
   the two properties and two methods from INewList&amp;lt;T&amp;gt; ...&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;
   &lt;o:p&gt;
      &lt;font color=#000000&gt;&amp;nbsp;&lt;/font&gt;
   &lt;/o:p&gt;
   &lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;public
   void Clear()&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;{&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;..
   efficient! ...&lt;br&gt;
   &lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/blockquote&gt; 
&lt;p&gt;
   Now when someone calls Clear on a MyList&amp;lt;T&amp;gt; directly, the compiler will bind
   to the efficient Clear.
&lt;/p&gt;
&lt;p&gt;
   This is still not perfect.&amp;nbsp; If you pass the MyList&amp;lt;T&amp;gt; to an API that takes
   in an INewList&amp;lt;T&amp;gt;, any calls to Clear will fall back to the extension method.&amp;nbsp;
   Extension methods are not virtual in any way.&amp;nbsp; You can try to simulate virtual
   dispatch, but it gets messy quick.&amp;nbsp; For example, say we defined an IFasterList&amp;lt;T&amp;gt;
   that includes all those convenience methods that lists frequently want to make faster;
   we can then do a typecheck plus virtual dispatch in the extension method.
&lt;/p&gt;
&lt;p&gt;
   For now, let's pretend that's just the Clear method:
&lt;/p&gt;
&lt;blockquote dir=ltr style="MARGIN-RIGHT: 0px"&gt; 
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;public
   interface IFasterList&amp;lt;T&amp;gt; : INewList&amp;lt;T&amp;gt;&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;{&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;void
   Clear();&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/blockquote&gt; 
&lt;p&gt;
   Of course, MyList&amp;lt;T&amp;gt; above would now implement IFasterList&amp;lt;T&amp;gt;.&amp;nbsp; Invocations
   through IFasterList&amp;lt;T&amp;gt; will automatically bind to the faster variant; but if
   objects that implement IFasterList&amp;lt;T&amp;gt; get passed around as IList&amp;lt;T&amp;gt;s,
   you lose this ability.&amp;nbsp; So the Clear extension method can now do a typecheck:
&lt;/p&gt;
&lt;blockquote dir=ltr style="MARGIN-RIGHT: 0px"&gt; 
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;public
   static void Clear&amp;lt;T&amp;gt;(this INewList&amp;lt;T&amp;gt; lst)&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;{&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;IFasterList&amp;lt;T&amp;gt;
   fstLst = lst as IFasterList&amp;lt;T&amp;gt;;&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp; &lt;/span&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/span&gt;if
   (fstLst != null) {&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;fstLst.Clear();&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;return;&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;
   &lt;o:p&gt;
      &lt;font color=#000000&gt;&amp;nbsp;&lt;/font&gt;
   &lt;/o:p&gt;
   &lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;int
   count;&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;while
   ((count = lst.Count) &amp;gt; 0) {&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;lst.RemoveAt(count
   - 1);&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;p class=MsoNormal style="MARGIN: 0in 0in 0pt"&gt;
   &lt;span style="FONT-SIZE: 10pt; LINE-HEIGHT: 115%; FONT-FAMILY: Consolas"&gt;&lt;font color=#000000&gt;&lt;span style="mso-spacerun: yes"&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;}&lt;o:p&gt;&lt;/o:p&gt;
   &lt;/font&gt;&lt;/span&gt;
&lt;/p&gt;
&lt;/blockquote&gt; 
&lt;p&gt;
   This works but is obviously a tedious and hard-to-maintain solution.&amp;nbsp; It would
   be neat if someday C# figured out a way to "magically" reconcile virtual dispatch
   and extension methods.&amp;nbsp; I don't know if there is a clever solution out there.&amp;nbsp;
   I am skeptical.&amp;nbsp; Nevertheless, despite this flaw, the above techniques are certainly
   thought provoking and interesting enough to play around with and consider for your
   own projects.&amp;nbsp; And at the very least, it's fun.&amp;nbsp; Enjoy.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=b9072376-beaf-4d17-a6c4-4bfccfbd34b0" /&gt;</description>
      <category>Technology</category>
    </item>
    <item>
      <trackback:ping>http://www.bluebytesoftware.com/blog/Trackback.aspx?guid=53e892af-d387-4283-9b10-a463f8ac3eba</trackback:ping>
      <pingback:server>http://www.bluebytesoftware.com/blog/pingback.aspx</pingback:server>
      <pingback:target>http://www.bluebytesoftware.com/blog/PermaLink,guid,53e892af-d387-4283-9b10-a463f8ac3eba.aspx</pingback:target>
      <dc:creator>
      </dc:creator>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>
      Sometimes you need to wait for something before proceeding with a computation.
   </p>
        <p>
      Perhaps you need to know the value of some integer that is being computed concurrently. 
      Maybe you need to wait for the bytes to flush to disk before telling another process
      the file is consistent and ready to read.  Or you need to get that next row back
      from the database before painting it on the UI.  It could be that you need to
      wait for the missile to leave the bay before closing the bay door.  And so on.
   </p>
        <p>
      And sometimes there’s simply nothing better to do while waiting for these things to
      happen other than to let the CPU halt (or let other processes on the machine run). 
      You need to twiddle your thumbs a bit, and exhibit a little patience.  Or at
      least your program does.  This is simply an unfortunate fact of life.
   </p>
        <p>
      This manifests numerous ways in our programming models:
   </p>
        <p>
              1) Waiting on an event.<br />
              2) Waiting to acquire an already-held
      lock.<br />
              3) Finding that the GUI message queue
      is empty and doing a MsgWaitForMultipleObjectsEx.<br />
              4) Finding that the COM RPC queue
      is empty and doing a CoWaitForMultipleHandles.<br />
              5) Issuing an Ada rendezvous ‘accept’
      and finding that no messages await you, thus blocking.<br />
              6) Issuing an Erlang ‘receive’ and
      finding that no messages await you, thus blocking.<br />
              7) Waiting on a .NET 4.0 task.<br />
              8) Issuing a ContinueWith on a .NET
      4.0 task.<br />
              9) And so on.
   </p>
        <p>
      There are three big distinctions to make about the characteristic nature
      of this waiting: namely, (1) what condition's establishment is being sought --
      i.e. the reason for the wait, (2) whether multiple such conditions of interest
      may be waited on simultaneously, and, related, (3) whether waiting for said condition(s)
      necessarily means that the processing of some other conditions that may arise elsewhere,
      but require the blocked context to run, cannot occur.
   </p>
        <p>
      I will be the first to admit that this statement is rather abstract.  But it
      really does matter.
   </p>
        <p>
      For example, MsgWaitForMultipleObjectsEx is a pumping wait.  Not only do you
      wait for the occurrence of one of several events to get set, but the arrival
      of a new top-level message at the message queue (either GUI or COM RPC-related) causes
      immediate processing of that message, presuming the thread is blocked at that call
      at the time.  Although you can be deeply nested in some complicated code, you
      “jump” to the event loop to run the message handling code.  Vanilla WaitForMultipleObjectsEx
      works in a similar way vis-à-vis APCs, provided the wait is alertable.  This
      is quite different from a fully blocking non-pumping wait, which only waits for one
      or more very specific events, but does not dispatch messages simultaneously.
   </p>
        <p>
      Win32 esoterica aside, the concepts appear elsewhere.  The moral equivalent
      in Ada or Erlang is to do a selective-accept or -receive, intentionally not dispatching
      certain messages that might arrive in the meantime.  (To be fair, you can also
      do this in COM with message filters.)  This often happens when you nest accepts
      and receives.  You may be capable of processing messages A-Z at the top-level
      tail recursive loop; but if that nested accept only knows about message kinds M and
      N, then there are 24 other kinds that will not be picked up in the meantime.
   </p>
        <p>
      Not pumping for messages is dangerous.  And it can lead to deadlock if you pump
      for the wrong ones.  Like if you’re accepting M or N, yet the triggering of M
      or N depends on first processing some message K waiting in the queue.  COM RPCs
      with cycles run face first into this.  And/or not pumping can lead to responsiveness
      and scalability problems.  Perhaps M or N eventually does arrive, yet little
      old K needs to wait an indeterminate amount of time before it is seen.  Whereas
      we could have overlapped its processing.  This is why most STAs pump while waiting,
      and, similarly, why many Erlang processes consist of a main loop that is prepared
      to handle any message the process accepts at that top level loop.  They may seem
      very different but they are strikingly not.
   </p>
        <p>
      Yet paradoxically pumping for messages is also dangerous.  You must predict all
      the kinds of messages that may reentrantly get executed, and your state at the point
      of the blocking call must be consistent enough to tolerate them.  (At least those
      that will actually happen.)  In COM STAs, this can be wholly unpredictable and
      indeed because the CLR auto-pumps on STAs the blocking points can be hidden. 
      Overly aggressively admitting messages may seem like the right thing to do, until
      you’ve wedged yourself into some unforeseen inconsistent state.  You can avoid
      this by making each message handler atomic; see Argus.  But if you can't or don't
      have the discipline to do that, or aren't quite sure, you must not pump.  You
      either avoid pumping altogether or you selectively pump messages that do not
      touch the state encapsulated by the pump.  Or you lock access to state with a
      non-recursive lock and run the risk of deadlock.
   </p>
        <p>
      I have found it clarifying to think about blocking in event loop concurrency
      and state machine terms, advancing from one state to the next in between waits. 
      It’s a slippery model, but particularly when working in message passing systems that
      employ event loops, it can help to identify all the familiar problems with shared
      memory, blocking, and consistency.
   </p>
        <p>
      Indeed it is interesting how blocking and non-blocking systems can rapidly approach
      each other.  Starting from either extreme tempts you to tiptoe closer and closer
      to the middle.  The familiarity of the other extreme tempts you.  Until,
      alas, you just might meet in the middle.
   </p>
        <img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=53e892af-d387-4283-9b10-a463f8ac3eba" />
      </body>
      <title>Musing on messages and blocking</title>
      <guid>http://www.bluebytesoftware.com/blog/PermaLink,guid,53e892af-d387-4283-9b10-a463f8ac3eba.aspx</guid>
      <link>http://www.bluebytesoftware.com/blog/2010/01/08/MusingOnMessagesAndBlocking.aspx</link>
      <pubDate>Fri, 08 Jan 2010 08:11:39 GMT</pubDate>
      <description>&lt;p&gt;
   Sometimes you need to wait for something before proceeding with a computation.
&lt;/p&gt;
&lt;p&gt;
   Perhaps you need to know the value of some integer that is being computed concurrently.&amp;nbsp;
   Maybe you need to wait for the bytes to flush to disk before telling another process
   the file is consistent and ready to read.&amp;nbsp; Or you need to get that next row back
   from the database before painting it on the UI.&amp;nbsp; It could be that you need to
   wait for the missile to leave the bay before closing the bay door.&amp;nbsp; And so on.
&lt;/p&gt;
&lt;p&gt;
   And sometimes there’s simply nothing better to do while waiting for these things to
   happen other than to let the CPU halt (or let other processes on the machine run).&amp;nbsp;
   You need to twiddle your thumbs a bit, and exhibit a little patience.&amp;nbsp; Or at
   least your program does.&amp;nbsp; This is simply an unfortunate fact of life.
&lt;/p&gt;
&lt;p&gt;
   This manifests numerous&amp;nbsp;ways in our programming models:
&lt;/p&gt;
&lt;p&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1)&amp;nbsp;Waiting on an event.&lt;br&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 2)&amp;nbsp;Waiting to acquire an already-held
   lock.&lt;br&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 3)&amp;nbsp;Finding that the GUI message queue
   is empty and doing a MsgWaitForMultipleObjectsEx.&lt;br&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 4)&amp;nbsp;Finding that the COM RPC queue
   is empty and doing a CoWaitForMultipleHandles.&lt;br&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 5)&amp;nbsp;Issuing an Ada rendezvous ‘accept’
   and finding that no messages await you, thus blocking.&lt;br&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 6)&amp;nbsp;Issuing an Erlang ‘receive’ and
   finding that no messages await you, thus blocking.&lt;br&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 7)&amp;nbsp;Waiting on a .NET 4.0 task.&lt;br&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 8)&amp;nbsp;Issuing a ContinueWith on a .NET
   4.0 task.&lt;br&gt;
   &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 9)&amp;nbsp;And so on.
&lt;/p&gt;
&lt;p&gt;
   There&amp;nbsp;are three big&amp;nbsp;distinctions to make about the characteristic nature
   of this waiting: namely, (1) what condition's establishment&amp;nbsp;is being sought --
   i.e. the reason for the wait, (2) whether&amp;nbsp;multiple such conditions of interest
   may be waited on simultaneously, and, related, (3) whether waiting for said condition(s)
   necessarily means that the processing of some other conditions that may arise elsewhere,
   but require the blocked context to run,&amp;nbsp;cannot occur.
&lt;/p&gt;
&lt;p&gt;
   I will be the first to admit that this statement is rather abstract.&amp;nbsp; But it
   really does&amp;nbsp;matter.
&lt;/p&gt;
&lt;p&gt;
   For example, MsgWaitForMultipleObjectsEx is a pumping wait.&amp;nbsp; Not only do you
   wait for the occurrence of one of several events to get set, but&amp;nbsp;the arrival
   of a new top-level message at the message queue (either GUI or COM RPC-related) causes
   immediate processing of that message, presuming the thread is blocked at that call
   at the time.&amp;nbsp; Although you can be deeply nested in some complicated code, you
   “jump” to the event loop to run the message handling code.&amp;nbsp; Vanilla WaitForMultipleObjectsEx
   works in a similar way vis-à-vis APCs, provided the wait is alertable.&amp;nbsp; This
   is quite different from a fully blocking non-pumping wait, which only waits for one
   or more very specific events, but does not dispatch messages simultaneously.
&lt;/p&gt;
&lt;p&gt;
   Win32 esoterica aside, the concepts appear elsewhere.&amp;nbsp;&amp;nbsp;The moral equivalent
   in Ada or Erlang is to do a selective-accept or -receive, intentionally not dispatching
   certain messages that might arrive in the meantime.&amp;nbsp; (To be fair, you can also
   do this in COM with message filters.)&amp;nbsp; This often happens when you nest accepts
   and receives.&amp;nbsp; You may be capable of processing messages A-Z at the top-level
   tail recursive loop; but if that nested accept only knows about message kinds M and
   N, then there are 24 other kinds that will not be picked up in the meantime.
&lt;/p&gt;
&lt;p&gt;
   Not pumping for messages is dangerous.&amp;nbsp; And it can lead to deadlock if you pump
   for the wrong ones.&amp;nbsp; Like if you’re accepting M or N, yet the triggering of M
   or N depends on first processing some message K waiting in the queue.&amp;nbsp; COM RPCs
   with cycles run face first into this.&amp;nbsp; And/or not pumping can lead to responsiveness
   and scalability problems.&amp;nbsp; Perhaps M or N eventually does arrive, yet little
   old K needs to wait an indeterminate amount of time before it is seen.&amp;nbsp; Whereas
   we could have overlapped its processing.&amp;nbsp; This is why most STAs pump while waiting,
   and, similarly, why many Erlang processes consist of a main loop that is prepared
   to handle any message the process accepts at that top level loop.&amp;nbsp; They may seem
   very different but they are strikingly not.
&lt;/p&gt;
&lt;p&gt;
   Yet paradoxically pumping for messages is also dangerous.&amp;nbsp; You must predict all
   the kinds of messages that may reentrantly get executed, and your state at the point
   of the blocking call must be consistent enough to tolerate them.&amp;nbsp; (At least those
   that will actually happen.)&amp;nbsp; In COM STAs, this can be wholly unpredictable and
   indeed because the CLR auto-pumps on STAs the blocking points can be hidden.&amp;nbsp;
   Overly aggressively admitting messages may seem like the right thing to do, until
   you’ve wedged yourself into some unforeseen inconsistent state.&amp;nbsp; You can avoid
   this by making each message handler atomic; see Argus.&amp;nbsp; But if you can't or don't
   have the discipline to do that, or aren't quite sure, you must not pump.&amp;nbsp; You
   either&amp;nbsp;avoid pumping altogether or you selectively pump messages that do not
   touch the state encapsulated by the pump.&amp;nbsp; Or you lock access to state with a
   non-recursive lock and run the risk of deadlock.
&lt;/p&gt;
&lt;p&gt;
   I have found it clarifying to think&amp;nbsp;about blocking&amp;nbsp;in&amp;nbsp;event loop concurrency
   and state machine terms, advancing from one state to the next in between waits.&amp;nbsp;
   It’s a slippery model, but particularly when working in message passing systems that
   employ event loops, it can help to identify all the familiar problems with shared
   memory, blocking, and consistency.
&lt;/p&gt;
&lt;p&gt;
   Indeed it is interesting how blocking and non-blocking systems can rapidly approach
   each other.&amp;nbsp; Starting from either extreme tempts you to tiptoe closer and closer
   to the middle.&amp;nbsp; The familiarity of the other extreme tempts you.&amp;nbsp; Until,
   alas, you just might meet in the middle.
&lt;/p&gt;
&lt;img width="0" height="0" src="http://www.bluebytesoftware.com/blog/aggbug.ashx?id=53e892af-d387-4283-9b10-a463f8ac3eba" /&gt;</description>
      <category>Technology</category>
    </item>
  </channel>
</rss>