# Scientific Computing in Python

December 10, 2007

Recently there has been a lot of momentum behind Python, which seems to have
a *very* active scientific computing community which places a lot of emphasis
on efficiency. I spent much of the weekend reading about various Python
packages and tools for numerical and mathematical work and my curiosity is
piqued. I’ll try to summarize some of the things I’ve discovered.

My primary reason for considering a “scripting language” over a “compiled
language” is that such languages tend to result in shorter programs, less
error-prone code, and thus a decrease in development time. A second concern
is the need for *completely* reproducible research. That is, there should be
a way for another researcher to completely replicate what you have done from
start to finish. It seems to me that this is most easily accomplished
through open source tools and libraries which are usually widely-available
for many different platforms.

## SciPy

SciPy is the center of the scientific Python community. The two primary libraries are NumPy and SciPy. NumPy is the foundation, providing a multidimensional array object, linear algebra functions, and random number generators. SciPy is built around NumPy and contains modules for optimization, integration, interpolation, etc.

In this brief example from the SciPy Tutorial, you can see that at a basic level, the code is no more complicated than Matlab:

```
from numpy import matrix
from scipy.linalg import inv, det, eig
A = matrix([[1,1,1],[4,4,3],[7,8,5]]) # 3 lines 3 rows
b = matrix([1,2,1]).transpose() # 3 lines 1 rows.
print det(A) # Determinant of A
print inv(A)*b # The solution of the linear system Ax=b.
print eig(A) # Eigenvalues and eigenvectors of A
```

Of course, being a fully functional programming language, it has many much more powerful features.

On Ubuntu or Debian Linux, installing SciPy and NumPy is as simple as:

```
sudo apt-get install python2.4 python-scipy python-numpy
```

At first I had a few questions regarding the implementation and performance of the various SciPy algorithms, but the FAQ quickly answered them:

How can SciPy be fast if it is written in an interpreted language like Python?Actually, the time-critical loops are usually implemented in C or Fortran. Much of SciPy is a thin layer of code on top of the scientific routines that are freely available at <www.netlib.org>. Netlib is a huge repository of incredibly valuable and robust scientific algorithms written in C and Fortran.

Source:SciPy FAQ

So, many of the algorithms are implemented as wrappers to existing libraries such as LAPACK and the GSL. Also, in practice it is very easy to include inline Fortran or C++ code using the Weave or f2py packages.

## OpenOpt

OpenOpt is a free Python-based alternative to commercial frameworks like
AMPL and GAMS. It provides a unified framework for solving many
types of problems through the use of various solvers. Here is
an example from the website which optimizes a simple nonlinear objective
function $f(x)=(x-1{)}^{2}$ starting at ${x}_{0}=4$ using the `ralg`

solver:

```
from openopt import NLP
p = NLP(lambda x: (x-1)**2, 4)
r = p.solve('ralg')
```

You only need to choose a solver which is appropriate for your problem, based on
the class of problems it belongs to (e.g., nonlinear minimization or linear
least-squares problems) and the kind of constraints you need to impose. Some
of the solvers, such as `ralg`

, were written by the OpenOpt developers. You
can also use solvers from SciPy, such as `scipy.optimize.fmin_bfgs`

, or
LAPACK, such as `lapack_dgelss`

for linear least-squares problems. The latest
OpenOpt news can be found in the development forum.

## PyMC

PyMC is a Python module for conducting Bayesian estimation through Markov Chain Monte Carlo (MCMC) sampling. It implements the Metropolis Hastings algorithm as a Python class and provides routines to assist in generating plots and summary statistics. The author has written a very nice users guide which also provides a good amount of theory along with usage information.

## The SAGE project

The SAGE project aims to be a sort of open source Matlab-Mathematica hybrid. Instead of rewriting every algorithm under the sun, it provides a unified interface to various open source programs such as Octave and Maxima alongside commercial offerings such as Matlab and Mathematica. It’s language is essentially Python code so it is much more versatile than your typical math environment. There are several options for interacting with SAGE, including a web based interface called the Sage Notebook, an interactive shell, or by generating compiled code via cython. There is also an Emacs mode.

## Additional Resources

- NumPy Tutorial
- SciPy Course
- matplotlib for plotting