Monday, September 7, 2009

Scaling Python applications - easy heterogeneous clustering with Parallel Python

Python is often accused of being a slow interpreted language, in spite of ample evidence that critical sections are easy to accelerate with native code, and of large projects such as YouTube being written in Python. Python is a great glue language: it holds together disparate bits of code, provides an easy interface to multiple languages, and is an invaluable prototyping tool.

I wrote a naive inverse distance weighted interpolation for a set of field data, and it ran painfully slowly (about 1 second per interpolated point). So I looked into accelerating it with Parallel Python. It was surprisingly easy to set up and to recode the algorithm to run in parallel. The problem is embarrassingly parallel, with the same operation being done on each grid point. Extending from 1 laptop to 7 different machines made it run around 3 times faster. Admittedly, I ran the job over wireless, and some of the machines were Windows desktops with few dedicated resources while others were servers running Linux. Still, the exercise demonstrates the flexibility of Python.
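To make the "embarrassingly parallel" point concrete, here is a minimal sketch of the idea. It is not the original code: the function names (idw_point, idw_chunk, split, interpolate), the 2D (x, y, value) sample layout, and the power of 2 are all assumptions for illustration. It uses only the standard library, and the map function is pluggable so that any parallel backend can be swapped in.

```python
from functools import partial
from math import hypot


def idw_point(samples, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    samples is a list of (sx, sy, value) tuples (an assumed layout).
    """
    num = 0.0
    den = 0.0
    for sx, sy, value in samples:
        d = hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # the point sits exactly on a sample
        w = 1.0 / d ** power  # closer samples get larger weights
        num += w * value
        den += w
    return num / den


def idw_chunk(samples, points, power=2.0):
    """Interpolate a chunk of grid points; each point is independent."""
    return [idw_point(samples, x, y, power) for x, y in points]


def split(points, n_chunks):
    """Split the grid into at most n_chunks contiguous pieces."""
    size = max(1, -(-len(points) // n_chunks))  # ceiling division
    return [points[i:i + size] for i in range(0, len(points), size)]


def interpolate(samples, points, n_chunks=4, map_fn=map):
    """Run idw_chunk over every chunk using map_fn and flatten the result.

    map_fn defaults to the built-in (serial) map; an order-preserving
    parallel map can be passed instead.
    """
    chunks = split(points, n_chunks)
    results = map_fn(partial(idw_chunk, samples), chunks)
    return [value for chunk in results for value in chunk]
```

Because no chunk depends on any other, the serial map can be replaced by a parallel one without touching the interpolation itself. For example, passing the map method of a concurrent.futures.ProcessPoolExecutor as map_fn should spread the chunks over local cores; with Parallel Python, each chunk would instead be submitted as a job to the job server and the results collected in order.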
