Thursday, August 18, 2005

Magical Concurrency Faeries or How I Learned To Stop Worrying and Love Deferreds

Every now and then, a recipe like this one pops up. Wouldn't it be wonderful to live in a world where you could close your eyes, tap your heels together, say "this API is asynchronous" three times and be rewarded with an asynchronous version of a synchronous function? I know my life would be a lot easier in that world.

We live in a world with significantly less magic than people sometimes make out. The programming languages we have available now simply cannot perform the synchronous-to-asynchronous transformation automatically for the general case. There are languages which provide interesting constructs which may be the beginnings of such a feature, but Python isn't one of them. What Python gives you in the way of "magic" asynchronous behavior is what we've had for much more than a decade: platform native, pre-emptive threads. While these can be put to some use, the programmer must understand the implications or the result is randomly buggy software.

If you haven't done much programming with threads, this consequence may not be obvious. Consider the case of a library which uses module-level storage to implement its public API. Now consider the author didn't plan to use it with threads and so didn't protect any access of this shared resource with locks. Now try and imagine what will happen if your program happens to call into this library from two different threads at about the same time. For bonus points, try to imagine how you might write a unit test which illuminated this bug in your software. An example of a library written in such a way is the Python standard library urllib module.

Asynchronous programming feels unnatural to many people. The tendency exists to want to write programs that read like they execute: top to bottom. Perhaps this tendency exists because synchronous programming is what most programmers are first taught, or perhaps it is because synchronous programming is simply a better fit for our brains; regardless of the reason behind the preference, it does not change the fact that asynchronous programs cannot be written as though they were synchronous with today's tools. If you think threads are hot, use threads and don't waste your time with tools like Twisted. If you think asynchronous programming presents serious benefits (as I do), embrace it all the way: it won't be easy (writing software rarely is), and you will often have to take existing synchronous software and rewrite it so as to present an asynchronous API. At these times, grit your teeth and get on with it. You'll be much happier in the long run than if you had tried to cobble something together using threads and end up spending a far greater amount of time tracking down non-deterministic bugs that only appear under load after days of runtime on someone else's computer and which disappear whenever you insert debug prints or run with a debugger enabled.

Oh yea, and as Itamar pointed out, the particular recipe I referenced above is, in addition to being conceptually flawed, implemented in a way which is just plain buggy. Even if you disagree with everything I said above, you shouldn't use it.

3 comments:

  1. python nowadays has the means for cooperative multitasking - either with generators or with greenlets. When carefully implemented, I think this allows cooperation between asynch and synch code without special problems. I've seen the concurrency fairy, and it is real. Or so I think. Don't try to scare me with threads. I know better.

    ReplyDelete
  2. What do you mean by "cooperation between async and sync code"? If you use generators for multitasking, all your code needs to know about this and explicitly support the context switch operation. If you use greenlets, all your code needs to use special, greenlet-ized versions of any blocking function. Dangerous hacks like lettucewrap aside, this indicates that all your code is written in an asynchronous fashion. Where is the cooepration with synchronous code?

    ReplyDelete
  3. I must apologize: I was discussing a related subject with other people, got to your article through a search, and mistook it to be a statement that it wasn't. I re-read it, and the recipe, and my mistake became clear to me.

    I'm sorry about my first comment, it was off-topic, and I suppose you're right, and code must be aware of any possible context switches.

    ReplyDelete