Scientific Programming with Python (for debugging) and C(ython) (for speed)

With Cython, one can finally write Python and obtain near-C performance. This is especially valuable if you have to start a project from scratch.

There is a well-written Cython tutorial for NumPy users. The main steps of the cythonization process are summarized here as a teaser guide.

If you are not convinced by the speed one can obtain with Cython, there are benchmarks on the web, as well as the Cython profiling tutorial.

The methodology is outlined here, with a focus on the important details:

- Write your code in pure Python. This is easy, so you can play with it, think about the best algorithm for your problem, etc. Use IPython for debugging (you can inspect and modify variables at run time after an error has occurred).

- Remove all uses of Python-specific data containers such as sets, dicts, and lists. Use NumPy arrays instead.

- Do not use vectorized operations; use explicit loops instead, as you would in C/C++. This is highly inefficient in NumPy, but it is what you need for Cython.

Your code is now very slow, because you use explicit loops in Python/NumPy, which is normally discouraged, and because the Python interpreter performs many background checks and allows dynamic typing (which means even more checks at execution time).
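As a rough illustration of these first steps, the sketch below computes the same quantity twice: once with vectorized NumPy operations, and once with explicit C-style loops over a NumPy array. The function names and the chosen computation (smallest squared distance between any two points) are hypothetical and only serve as an example; they do not come from the tutorial mentioned above.

```python
import numpy as np


def min_sq_dist_vectorized(pts):
    """Smallest squared distance between two distinct rows of pts (vectorized)."""
    diff = pts[:, None, :] - pts[None, :, :]   # shape (n, n, d)
    sq = (diff ** 2).sum(axis=2)               # shape (n, n)
    np.fill_diagonal(sq, np.inf)               # ignore each point's distance to itself
    return sq.min()


def min_sq_dist_loops(pts):
    """Same result with explicit loops: slow in Python, but easy to cythonize."""
    n = pts.shape[0]
    d = pts.shape[1]
    best = np.inf
    for i in range(n):
        for j in range(i + 1, n):
            acc = 0.0
            for k in range(d):
                delta = pts[i, k] - pts[j, k]
                acc += delta * delta
            if acc < best:
                best = acc
    return best


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.random((50, 3))                  # 50 random points in 3D, float64
    print(min_sq_dist_vectorized(pts), min_sq_dist_loops(pts))
```

The loop version only uses scalar indexing and plain arithmetic, which is the form that benefits most once the variables are given C types.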

- Cythonize it! Create a setup.py file and copy-paste the bulk of your original code into a main function in a .pyx file. (This is explained very well in the Cython Hello World tutorial.) Hint: to indent a block of text, you can usually select it (even hundreds of lines) and hit Tab, e.g. in gedit.

Your code is now compiled with Cython, wrapped in a main function (plus additional functions if you have any). It is still very slow, because the generated .c file still performs all the usual Python checks.
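For reference, a minimal setup.py for the cythonization step might look like the sketch below. The module name my_module and the file my_module.pyx are placeholders, and the use of setuptools is an assumption (the Hello World tutorial may use a slightly different layout).

```python
# setup.py: minimal build script (sketch); "my_module.pyx" is a placeholder name
from setuptools import setup
from Cython.Build import cythonize

setup(
    name="my_module",
    # cythonize() translates the .pyx file to C and declares the extension module
    ext_modules=cythonize("my_module.pyx"),
)
```

The extension is then typically built in place with $ python setup.py build_ext --inplace, after which import my_module works from the same directory. If the .pyx file uses cimport numpy, the NumPy headers also have to be passed to the build, for instance via include_dirs=[numpy.get_include()].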

- Run $ cython -a your_pyx_file.pyx. Open the HTML document it produces: the yellow highlighting indicates how heavily each line calls the Python API. You can click on a line to see its C implementation and the details of the Python API calls.

- Type all the core variables, especially the arrays you use. You can use profiling tools, but first you should try to remove all the API calls you can. When there is no more yellow in the main loop of your code, you should be fine, provided your algorithm (independently of the language) is well designed.

- Deactivate boundscheck and wraparound when debugging is over: add the two comment lines #cython: boundscheck=False and #cython: wraparound=False at the top of your file, each on its own line and before any code (other comments may come before them).
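To make the typing and directive steps concrete, here is a sketch of what a fully typed .pyx file could look like. The function (smallest squared distance between any two rows of an array) and the file name are hypothetical. The array is typed with a typed memoryview (double[:, ::1]), which is an assumption of this sketch; the tutorial for NumPy users may present a different array-typing syntax.

```cython
# cython: boundscheck=False
# cython: wraparound=False
# my_module.pyx (hypothetical file name): typed version of a loop-based function

def min_sq_dist(double[:, ::1] pts):
    """Smallest squared distance between two distinct rows of pts."""
    # Every variable used in the hot loop gets a C type via cdef.
    cdef Py_ssize_t n = pts.shape[0]
    cdef Py_ssize_t d = pts.shape[1]
    cdef Py_ssize_t i, j, k
    cdef double acc, delta
    cdef double best = float("inf")

    for i in range(n):
        for j in range(i + 1, n):
            acc = 0.0
            for k in range(d):
                # With typed indices and a typed memoryview, this is a plain C array access.
                delta = pts[i, k] - pts[j, k]
                acc += delta * delta
            if acc < best:
                best = acc
    return best
```

With these declarations, the annotated HTML from cython -a should show little or no yellow inside the three nested loops. The function expects a C-contiguous float64 array, so a call such as min_sq_dist(np.ascontiguousarray(pts, dtype=np.float64)) is a safe way to pass NumPy data in.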

Your code is now almost as fast as C, and you saved a lot of debugging time! The syntax is that of Python, except for the variable declarations (using cdef). If at some point you need to go back to a debugging-friendly environment, you can always copy-paste your .pyx file into a .py file, comment out all the declarations, and work with the Python code.

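Cython extensions are usually built with the compiler flags of the Python installation, which may include -fno-strict-aliasing. That flag makes the C compiler more conservative about aliasing, so tight loops can remain somewhat slower than the equivalent Fortran, but in practice performance is not very far from Fortran either.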