Computational Statistics, Machine Learning, et. al.

Pandas: Installation

Pandas can be installed with easy_install or built from source (hosted on GitHub). I provide a quick guide to installation for those who might be new to Python or who want to start tinkering with existing packages. Wes provides more detail in the Pandas documentation, which is the definitive source on this subject.

Pandas has a number of dependencies. If you are using IPython, and you should, then you will already have several of these installed. [Chris Fonnesbeck, of PyMC fame, recently wrote a nice review of some of the new features.]

Getting Pandas: The Easy Way

Like most good Python packages, Pandas can be installed using easy_install, so you can just call (note: I've included the -U flag, which upgrades the package if it's already installed):

easy_install -U pandas

You can also install all the dependencies as needed:

easy_install -U numpy
easy_install -U scipy
easy_install -U python-dateutil
easy_install -U matplotlib
easy_install -U scikits.statsmodels
easy_install -U PyTables

These aren't all required, but it's sometimes easier just to get everything set up so that you have it there when you want it.
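To see which of these are already present before installing anything, a small check along the following lines may help. This is a minimal sketch, not part of Pandas: the `DEPENDENCIES` table and both function names are my own, and the import names are assumptions based on how these packages are usually imported (python-dateutil as `dateutil`, PyTables as `tables`).

```python
import importlib.util

# Install name -> import name. The two differ in a few cases
# (assumed mapping: python-dateutil -> dateutil, PyTables -> tables).
DEPENDENCIES = {
    "numpy": "numpy",
    "scipy": "scipy",
    "python-dateutil": "dateutil",
    "matplotlib": "matplotlib",
    "scikits.statsmodels": "scikits.statsmodels",
    "PyTables": "tables",
}


def is_importable(module_name):
    """Return True if Python can locate module_name on the current path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. "scikits") is missing
        # or when a module has no usable spec.
        return False


def check_dependencies(dependencies=DEPENDENCIES):
    """Map each install name to whether its import name can be found."""
    return {pkg: is_importable(mod) for pkg, mod in dependencies.items()}


if __name__ == "__main__":
    for pkg, found in check_dependencies().items():
        print(f"{pkg:22s} {'ok' if found else 'missing'}")
```

Anything reported as missing can then be installed with the matching easy_install line above.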

How to breed your own Pandas: use the source, Luke

Pandas is hosted on GitHub, so you will need a git installation to download the source code. If you're new to git, then I suggest spending a little time playing with GitHub. A good place to start is the beginner guide. There are two basic options: clone the head of Wes's branch of the code, or fork the code into your own repository (you might do the latter if you want to contribute code back to the project). To use the primary branch, just call:

git clone git://

Pandas has the several dependencies mentioned above, but building from source also requires Cython (a nice tutorial is available from SciPy 2009). Cython lets you compile Python-like code into C, sometimes with dramatic speed improvements; it also works with NumPy.

To install Cython, you need a working C/C++ compiler. On Debian, this is easy to set up through the package manager. On Windows, the easiest option is generally MinGW, which includes gcc and g++. Download MinGW and add the MinGW bin directory to your path. Then edit (or create) distutils.cfg, located at C:\Python26\Lib\distutils\distutils.cfg, so that it contains:

[build]
compiler=mingw32

Without this, you will probably get this common error message (I say from experience):

error: Unable to find vcvarsall.bat
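If that error shows up, one thing worth ruling out is whether the MinGW bin directory actually made it onto your path. The following is a minimal sketch using only the standard library; the function name `find_compilers` is my own, and it only checks that the executables are visible, not that distutils is configured to use them.

```python
import shutil


def find_compilers(names=("gcc", "g++")):
    """Map each compiler name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_compilers().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If either compiler is reported as not found, fix the path first and open a new terminal before retrying the build.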

Once you have this working, and you are in the base source directory, you can build and install Pandas using:

python setup.py build
python setup.py install

Assuming you don't get any error messages, you should now have a working version. You will need to restart Python unless you build it "inline". Any changes you make to the source code will have to be built and installed again (because of the Cython code). If you're just editing the Python code itself, you can always source it directly into your Python session to test it.
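After a build and install, it can be useful to confirm which copy of the package a fresh Python session actually picks up, especially if an older easy_install version is still around. Here is a minimal sketch; `describe_module` is a hypothetical helper of my own, and it assumes the package exposes `__version__`, which not every module does.

```python
import importlib


def describe_module(name):
    """Import a module and report its version and file location.

    Returns None if the module cannot be imported.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    return {
        # Not every module defines __version__, so fall back to None.
        "version": getattr(module, "__version__", None),
        "file": getattr(module, "__file__", None),
    }


if __name__ == "__main__":
    print(describe_module("pandas"))
```

The reported file path shows whether Python is loading the freshly installed build or some other copy on the path.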

My next post will cover the basics of creating and using TimeSeries and DataFrames.
