Ibis Project Blog

Python productivity framework for the Apache Hadoop ecosystem. Development updates, use cases, and internals.

Ibis 0.8: Initial PostgreSQL support, bug fixes

Thanks to contributions from pandas core team member Philip Cloud, Ibis now has initial support for PostgreSQL. The 0.8 release also fixes many bugs found in 0.7, and we recommend that all Ibis users upgrade.

Read more in the release notes.

What is new with Ibis

Ibis development has been slower in 2016 as I've been investing my energy in Apache Arrow, Apache Parquet, and other projects that are all part of the broader goal of making Python work better with Hadoop and other distributed data systems. Some of this functionality, like the ability to read Parquet files natively in Python, will start showing up in Ibis in the relatively near future.

In the meantime, we've been stabilizing and improving the existing Ibis functionality while working with the community to bring about new features.

PostgreSQL support in Ibis

The PostgreSQL database contains vast analytical functionality, far too much to cover completely right away, but Ibis now supports a meaningful and useful subset of Postgres's built-in functions. It also supports window functions as well as many date, string, and mathematical operations. If you find a Postgres function you'd like to see made available in Ibis, please let us know on the GitHub issue tracker.

Here's an example of what connecting to a PostgreSQL database and executing an expression looks like:

In [7]: client = ibis.postgres.connect(host='localhost', user='ibis_test',

In [8]: client.list_tables()
Out[8]: [u'functional_alltypes']

In [9]: t = client.table('functional_alltypes')

In [10]: t.groupby('string_col').size().execute()
Out[10]:
  string_col  count
0          8    730
1          9    730
2          2    730
3          5    730
4          3    730
5          1    730
6          7    730
7          0    730
8          4    730
9          6    730
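
To make the grouped count above more concrete, here is a rough sketch of the kind of SQL such an expression corresponds to. The query is hand-written for illustration and is not the literal text Ibis generates. It runs against an in-memory SQLite database from Python's standard library as a stand-in for PostgreSQL, and the twelve rows of toy data are invented for this example; only the table and column names come from the session above.

```python
import sqlite3

# Stand-in for the PostgreSQL table: same table and column name as in the
# session above, but filled with a handful of made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE functional_alltypes (string_col TEXT)")
rows = [(str(i % 3),) for i in range(12)]  # values '0', '1', '2', four rows each
conn.executemany("INSERT INTO functional_alltypes VALUES (?)", rows)

# Hand-written SQL for a per-group row count (an assumption about the shape
# of the query, not output captured from Ibis).
query = """
    SELECT string_col, COUNT(*) AS "count"
    FROM functional_alltypes
    GROUP BY string_col
    ORDER BY string_col
"""
counts = conn.execute(query).fetchall()
print(counts)
conn.close()
```

With Ibis you write the expression in Python and the backend does the work of producing and running a query like this, handing the result back as a pandas DataFrame.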