Pandas can be installed with easy_install or built from source (on GitHub). I provide a quick guide to installation for those who might be new to Python or who want to start tinkering with existing packages. Wes provides more detail in the Pandas documentation, which is the definitive source on this subject.
Pandas has a number of dependencies. If you are using IPython, and you should, then you will already have several of these installed. [Chris Fonnesbeck (twitter), of PyMC fame, recently wrote a nice review of some of the new features.]
Getting Pandas: The Easy Way
Like most good Python packages, Pandas can be installed using easy_install (http://peak.telecommunity.com/DevCenter/EasyInstall), so you can just call (note: I've included a -U flag, which will perform an update if it's already installed):
easy_install -U pandas
You can also install all the dependencies as needed:
easy_install -U numpy
easy_install -U scipy
easy_install -U python-dateutil
easy_install -U matplotlib
easy_install -U scikits.statsmodels
easy_install -U PyTables
These aren't all required, but it's sometimes easier just to get everything set up so that you have it there when you want it.
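If you want to see which of these are already present before installing anything, a short check along these lines can help. This is a minimal sketch, assuming Python 3; the mapping from package names to import names (for example python-dateutil to dateutil and PyTables to tables) and the helper names are my own, not part of Pandas.

```python
import importlib.util

# Package name as installed above -> module name to import (assumed mapping).
DEPENDENCIES = {
    "numpy": "numpy",
    "scipy": "scipy",
    "python-dateutil": "dateutil",
    "matplotlib": "matplotlib",
    "scikits.statsmodels": "scikits.statsmodels",
    "PyTables": "tables",
}


def is_importable(module_name):
    """Return True if Python can locate the named module."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package (e.g. "scikits") is itself missing.
        return False


def check_dependencies(dependencies=DEPENDENCIES):
    """Map each package name to whether its module can be found."""
    return {dist: is_importable(mod) for dist, mod in dependencies.items()}


if __name__ == "__main__":
    for dist, found in check_dependencies().items():
        print(f"{dist}: {'installed' if found else 'missing'}")
```

Anything reported as missing can then be installed with the matching easy_install line above.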
How to breed your own Pandas: use the source, Luke
Pandas is hosted on GitHub, so you will need a git installation to download the source code. If you're new to git, then I suggest spending a little time playing with GitHub. A good place to start is the beginner guide. There are two basic options: download the head of Wes's branch of the code, or fork the code into your own repository (you might do the latter if you want to contribute code back to the project). To use the primary branch, just call:
git clone https://github.com/wesm/pandas.git
In addition to the dependencies mentioned above, building Pandas from source requires Cython (a nice tutorial is available from SciPy 2009). Cython allows you to compile Python-like code into C for sometimes dramatic speed improvements; it also works with NumPy.
To install Cython, you need a working C/C++ compiler. On Debian, this is an easy operation. On Windows, the easiest option is generally MinGW, which includes gcc and g++. Download MinGW, and add the MinGW bin directory to your path. Then you need to edit (or create) distutils.cfg, located at C:\Python26\Lib\distutils\distutils.cfg, so that it tells distutils to use MinGW as the compiler.
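For reference, the setting commonly used to point distutils at MinGW is a [build] section with compiler = mingw32. The sketch below writes that file from Python. It assumes those two lines are all you want in the file, and write_distutils_cfg is a hypothetical helper of my own, not part of distutils.

```python
from pathlib import Path

# Assumed contents: the [build] section names the compiler distutils should use.
DISTUTILS_CFG = "[build]\ncompiler = mingw32\n"


def write_distutils_cfg(path):
    """Write the MinGW compiler setting to the given distutils.cfg path.

    Note: this overwrites any existing file at that path.
    """
    cfg = Path(path)
    cfg.write_text(DISTUTILS_CFG)
    return cfg


if __name__ == "__main__":
    # Path taken from the text above; adjust it for your Python installation.
    write_distutils_cfg(r"C:\Python26\Lib\distutils\distutils.cfg")
```

If the file already exists with other settings, it is safer to add the two lines by hand than to overwrite it.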
Without this, you will probably get this common error message (I say from experience):
error: Unable to find vcvarsall.bat
Once you have this working, and you are in the base source directory, you can build and install Pandas using:
python setup.py build
python setup.py install
Assuming you don't get any error messages, you should now have a working version. You will need to restart Python unless you build it "inline". Any changes you make to the source code will have to be built and installed again (because of the Cython code). If you're just editing the Python code itself, you can always source it directly into your Python session to test it.
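A quick way to confirm the build worked is to import the package from a fresh session and look at its version. This is a minimal sketch, assuming Python 3 and that the installed module exposes a __version__ attribute; installed_version is a hypothetical helper, not part of Pandas.

```python
import importlib


def installed_version(module_name="pandas"):
    """Import the module and return its __version__, or None if unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module is not installed, or its compiled extensions failed to load.
        return None
    return getattr(module, "__version__", None)


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("pandas could not be imported; check the build output for errors")
    else:
        print(f"pandas {version} is installed")
```

Run it from a directory other than the source checkout, so that Python picks up the installed copy rather than the unbuilt source tree.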
My next post will cover the basics of creating and using TimeSeries and DataFrames.