Wednesday, March 9, 2011

python's only problem: concurrency

python has its naysayers, but that's what jealousy will do to you :) seriously though, python does have one major issue: concurrent/parallel programming. now I'm no engineer, and I'm sure this is an oversimplification, but there are three ways of writing concurrent code:


  1. using multiple threads

  2. using multiple processes

  3. event-based programming



the main killer out of these three is trying to do multi-threaded programming in python. the short answer: don't, at least not for CPU-bound work. when python was designed, it was given the philosophy that people using it should have to worry about as little as possible under the hood. while that's great, the standard interpreter (CPython) only lets one thread execute python bytecode at a time, which makes threads inefficient for CPU-bound tasks. (threads that spend most of their time waiting on I/O fare better, since the lock is released while they wait.) essentially, adding threads to a CPU-bound task won't make your code any faster, and can make it slower. ironically the problem seems to be even worse on machines with multiple processors. the mechanism behind all this has a name: the Global Interpreter Lock (GIL). you can read more about it here:

http://wiki.python.org/moin/GlobalInterpreterLock

there's also a great presentation that explains the problem pretty well if you have time to watch it. I'll put it at the bottom of this post.
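if you'd rather see the effect than take my word for it, here's a rough sketch you can run yourself. it times the same pure-python countdown loop done twice in a row versus split across two threads. the function names (count_down, run_sequential, run_threaded) and the loop size are just ones I made up for illustration, and the actual numbers will depend on your machine and python version, so treat it as an experiment rather than a benchmark:

```python
import threading
import time


def count_down(n):
    """Pure-python busy loop: CPU-bound, no I/O."""
    while n > 0:
        n -= 1


def run_sequential(n, repeats=2):
    """Run the countdown `repeats` times back to back; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start


def run_threaded(n, num_threads=2):
    """Run one countdown per thread at the same time; return elapsed seconds."""
    threads = [
        threading.Thread(target=count_down, args=(n,))
        for _ in range(num_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 2_000_000  # arbitrary size, chosen so the run takes a moment
    print("sequential: %.2f seconds" % run_sequential(n))
    print("threaded:   %.2f seconds" % run_threaded(n))
```

on a standard CPython build with the GIL, I'd expect the threaded time to be no better than the sequential one, since only one thread can run python bytecode at any moment.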

then there's the issue of event-based programming. the problem in this arena (in my opinion, of course) is that there are too many choices for event-based programming, and none of them are included in the python standard library, so there's no standard as of yet.

lastly there's multi-process programming, which thankfully is in a better place than the other two. although it's only happened fairly recently (python 2.6, within the last couple of years), python now has a multiprocessing module in the standard library. so if you're looking to write some concurrent/parallel code, I'd say start here before wading through the alternatives that aren't in the standard library:

http://docs.python.org/library/multiprocessing.html
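to give a feel for what that looks like, here's a minimal sketch using multiprocessing.Pool to spread a CPU-heavy job across several processes. the count_primes function, the list of limits, and the choice of four worker processes are all made up for the example; Pool and its map method are the real parts from the standard library:

```python
from multiprocessing import Pool


def count_primes(limit):
    """Count the primes below `limit` by trial division (deliberately CPU-heavy)."""
    count = 0
    for candidate in range(2, limit):
        is_prime = True
        divisor = 2
        while divisor * divisor <= candidate:
            if candidate % divisor == 0:
                is_prime = False
                break
            divisor += 1
        if is_prime:
            count += 1
    return count


if __name__ == "__main__":
    # hypothetical workload: one independent job per limit
    limits = [20000, 30000, 40000, 50000]

    # each job runs in its own process with its own interpreter (and its own GIL)
    with Pool(processes=4) as pool:
        results = pool.map(count_primes, limits)

    for limit, total in zip(limits, results):
        print("primes below %d: %d" % (limit, total))
```

the `if __name__ == "__main__":` guard matters here: on platforms that start worker processes by re-importing your script, leaving it out can make the workers try to spawn workers of their own.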

as computing speed comes to depend more on multiple processors than on faster clock speeds, this issue will only become more important. thankfully there's a standard solution (the multiprocessing module) as well as plenty of alternatives, and hopefully in the near future one or two of these alternatives will become standard, or at least a de facto standard.

here's a page that has links to some resources related to concurrency in python, as well as links to plenty of the aforementioned alternatives:

http://wiki.python.org/moin/Concurrency

and here's that video I promised:
[blip.tv ?posts_id=2243379&dest=-1]
