Saturday, November 05, 2005

Ride the Snake

Python Code


When STScI first came out with their PyRAF software package, I was a little skeptical. Yes, the scripting language provided by IRAF was primitive and painful to use, so marrying IRAF to a decent scripting language is a great idea. But Python seemed a little weird, a little new. Why not go with a more common and mature language like Perl, which is installed on virtually every Unix box on the planet? Still, the more I've used Python/PyRAF, the more I'm convinced it was an inspired choice.

Python does have a few quirks. I'm still not a big fan of using white space to indicate code structure, but I must admit I'm rarely bitten by it. Of course, I only ever work on my own Python code. If I were working with others, I can see this might be a bit more of a problem, especially as the standard solution in the Python community is "never use tabs" and I happen to like using tabs (even worse, I like using 2-space tabs instead of the standard 4).

Anyway, where Python really shows its strength is its OOP-ness. Python is actually a completely object-oriented language, even though it doesn't always look like it. Unlike Perl, which tacks on objects as an afterthought, Python is OOP to the core; but it still retains that marvelous sloppiness of a scripting language, which makes it a great tool for doing some pretty flash things rather easily.

For making short little scripts, the object-ness of Python is not particularly useful, but it's not particularly invasive either. However, as I've stepped up to coding more complicated things, including my current task, a full-blown data pipeline, I've found objects more and more useful. The thing is that objects are a great way to deal with metadata without having to pass enormous numbers of parameters between tasks. Keeping track of the metadata is extremely useful in pipelines, where you need to mix lots of data together in different ways to get to your final data products. If you make objects that act as code analogues to the various logical combinations of data, the relationships between data sets are supplied naturally.
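To make that a bit more concrete, here is a minimal sketch of the idea. It is not code from my pipeline: the class names (Exposure, Stack), the fields, and the file names are all made up for illustration, and it is written in present-day Python. The point is only that a task can take one object and find all the metadata it needs hanging off it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Exposure:
    """One raw frame plus the metadata the pipeline needs (hypothetical fields)."""
    filename: str
    filter_name: str
    exptime: float  # exposure time in seconds


@dataclass
class Stack:
    """A logical combination of exposures taken through a single filter."""
    filter_name: str
    exposures: List[Exposure] = field(default_factory=list)

    def add(self, exposure: Exposure) -> None:
        # The object enforces the relationship between its data sets.
        if exposure.filter_name != self.filter_name:
            raise ValueError("exposure filter does not match stack filter")
        self.exposures.append(exposure)

    @property
    def total_exptime(self) -> float:
        return sum(e.exptime for e in self.exposures)


def describe(stack: Stack) -> str:
    """A 'task' that takes one object instead of a long parameter list."""
    n = len(stack.exposures)
    return f"{stack.filter_name}: {n} frames, {stack.total_exptime:.1f} s"


stack = Stack("V")
stack.add(Exposure("img001.fits", "V", 300.0))
stack.add(Exposure("img002.fits", "V", 450.0))
summary = describe(stack)
print(summary)
```

The describe function never needs to be told which files, which filter, or how long the exposures were; the Stack already knows.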

Python even has a built-in module for saving and retrieving object instances. That job can be a little tricky, and is certainly tedious if you have to code it by hand for each object type. In Python it's a snap: you just dump your object instance into the 'pickle' module and it saves it to disk in some magic appropriate manner, and will magically read it again later, handing you back an instance just as if you'd done nothing special with it. It's so easy it almost feels like cheating.
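For the curious, the round trip looks roughly like this. The Observation class and its attributes are invented for the example; pickle.dump and pickle.load are the real calls from the standard library.

```python
import os
import pickle
import tempfile


class Observation:
    """A made-up object carrying some metadata worth saving."""

    def __init__(self, target, filter_name, exptime):
        self.target = target
        self.filter_name = filter_name
        self.exptime = exptime  # seconds


obs = Observation("M31", "V", 300.0)

# Use a throwaway directory so the example cleans up after itself.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "obs.pkl")

    # Save the instance to disk (pickle files are binary).
    with open(path, "wb") as fh:
        pickle.dump(obs, fh)

    # Read it back later and get an equivalent instance.
    with open(path, "rb") as fh:
        restored = pickle.load(fh)

print(restored.target, restored.filter_name, restored.exptime)
```

Two things to keep in mind: the class definition has to be importable when you load the file, since pickle stores the instance's data and not the class's code, and you should only unpickle files you trust.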
