Getting started
What is it?
pandas is a Python package that provides fast, flexible, and expressive data structures designed to make working with "relational" and "labeled" data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real-world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis/manipulation tool available in any language. It is already well on its way towards this goal.
Main Features
Here are just a few of the things that pandas does well:
- Easy handling of missing data (represented as NaN, NA, or NaT) in floating point as well as non-floating point data.
- Size mutability: columns can be inserted and deleted from DataFrame and higher dimensional objects.
- Automatic and explicit data alignment: objects can be explicitly aligned to a set of labels, or the user can simply ignore the labels and let Series, DataFrame, etc. automatically align the data for you in computations.
- Powerful, flexible group by functionality to perform split-apply-combine operations on data sets, for both aggregating and transforming data.
- Make it easy to convert ragged, differently-indexed data in other Python and NumPy data structures into DataFrame objects.
- Intelligent label-based slicing, fancy indexing, and subsetting of large data sets.
- Intuitive merging and joining data sets.
- Flexible reshaping and pivoting of data sets.
- Hierarchical labeling of axes (possible to have multiple labels per tick).
- Robust IO tools for loading data from flat files (CSV and delimited), Excel files, and databases, and for saving/loading data from the ultrafast HDF5 format.
- Time series-specific functionality: date range generation and frequency conversion, moving window statistics, date shifting and lagging.
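A few of the features above can be illustrated with a short sketch. The data, labels, and variable names below are made up for illustration and are not taken from the pandas documentation; the block assumes pandas and NumPy are already installed.

```python
import io

import numpy as np
import pandas as pd

# Missing data: NaN marks a missing float, and reductions skip it by default.
prices = pd.Series([1.5, np.nan, 3.0], index=["a", "b", "c"])
n_missing = int(prices.isna().sum())
mean_price = prices.mean()  # computed over the two non-missing values

# Automatic alignment: arithmetic matches on labels, not positions.
# Labels present in only one operand ("a" and "d") come out as NaN.
left = pd.Series([1, 2, 3], index=["a", "b", "c"])
right = pd.Series([10, 20, 30], index=["b", "c", "d"])
aligned_sum = left + right

# Split-apply-combine: group rows by a key, then aggregate each group.
df = pd.DataFrame({"team": ["x", "y", "x", "y"], "score": [1, 2, 3, 4]})
totals = df.groupby("team")["score"].sum()

# Size mutability: add a column, then delete it again.
df["double"] = df["score"] * 2
del df["double"]

# IO tools: round-trip the frame through CSV text held in memory.
csv_text = df.to_csv(index=False)
round_trip = pd.read_csv(io.StringIO(csv_text))

# Time series: generate a daily date range and lag the values by one step.
dates = pd.date_range("2024-01-01", periods=4, freq="D")
ts = pd.Series([1, 2, 3, 4], index=dates)
shifted = ts.shift(1)  # first value becomes NaN

print(aligned_sum)
print(totals)
```

Printing `aligned_sum` shows the union of both indexes, which is why alignment never silently drops labels.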
Installation
The source code is currently hosted on GitHub at: https://github.com/pandas-dev/pandas
Working with Conda?
pandas is part of the Anaconda distribution and can be installed with Anaconda or Miniconda:
conda install pandas
Prefer Pip?
pandas can be installed via pip from PyPI:
pip install pandas
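After either command finishes, one quick way to confirm the install is to import the package and print its version. This is a minimal sketch; the variable name is illustrative and the version string printed depends on your environment.

```python
import pandas as pd

# If the import succeeds, pandas is available in the active environment.
installed_version = pd.__version__
print(installed_version)
```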
Installation from source
To install pandas from source you need Cython in addition to the normal dependencies above. Cython can be installed from PyPI:
pip install cython
In the pandas directory (the same one where you found this file after cloning the git repo), execute:
python setup.py install
or, to install in development mode:
python -m pip install -e . --no-build-isolation --no-use-pep517
or alternatively
python setup.py develop
See the full instructions for installing from source.
Documentation
The official documentation is hosted on PyData.org: https://pandas.pydata.org/pandas-docs/stable/