Let’s Write an LLVM Specializer for Python

This is the follow-up to my talk LLVM Optimized Python at the Harvard-Smithsonian Center for Astrophysics, where we’ll do the deep dive that I didn’t have time for. We’re going to build a single-module, Numba-like compiler for Python. It won’t be nearly as featureful or complete, but it should demonstrate how you can go about building your own little LLVM specializer for a subset of Python, or your own custom DSL expression compiler, and integrate it with the standard NumPy/SciPy stack for whatever scientific computing domain you work in. The full source for this project is available on GitHub and comes in at 1000 lines for the whole specializer. Very tiny!

There’s a whole slew of interesting domains where this kind of on-the-fly specializing compiler can be used:

Python is great for rapid development and high-level thinking, but it is slow due to too many levels of indirection, hashmap lookups, broken parallelism, a slow garbage collector, and boxed PyObject types. With LLVM we can keep writing high-level code and not sacrifice performance.
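To make the boxing and indirection cost a little more concrete, here is a minimal sketch. It assumes CPython 3 on a 64-bit machine; the function `add` and the variable names are mine, made up for illustration. It shows that even a small integer lives in a heap-allocated object larger than a machine word, and that a plain `a + b` compiles to a generic bytecode instruction that has to work out the operand types at run time.

```python
import dis
import sys


def add(a, b):
    # A trivially simple function; the interpreter still treats
    # both operands as generic boxed objects.
    return a + b


# Every Python int is a full object carrying a reference count and a
# type pointer on top of its numeric payload (CPython-specific detail).
boxed_size = sys.getsizeof(12345)

# A raw 64-bit machine integer, the kind native code works with,
# needs only 8 bytes.
raw_size = 8

# Collect the bytecode instruction names for add(). The addition shows
# up as a generic BINARY_* opcode rather than a typed machine add.
opnames = [ins.opname for ins in dis.get_instructions(add)]

print("boxed int size in bytes:", boxed_size)
print("bytecode for add():", opnames)
```

A specializer sidesteps both costs: once the argument types are known, it can emit a single typed machine instruction operating on unboxed values.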

You will need python, llvm, llvmpy, numpy and a bit of time. The best way to get all of these is to install Anaconda, maintained by my good friend Ilan. Don’t add any more entropy to the universe by compiling NumPy from source; just use Anaconda.
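If you want a quick sanity check that the Python-side pieces are importable before going further, something like the sketch below should do. The helper `missing_modules` is a hypothetical name of mine, and I am assuming llvmpy is imported under the top-level package name `llvm`; adjust the list if your install differs.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumption: llvmpy is exposed as the top-level package "llvm".
required = ["llvm", "numpy"]

missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

This only looks the modules up without importing them, so it runs quickly and will not crash on a broken install.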