This year, I collaborated with members of the Apache Impala (incubating) team at Cloudera to create a new C++ library. It is intended to eventually become a faster, more memory-efficient replacement for impyla, PyHive, and other (largely pure Python) client libraries for talking to Apache Hive and Impala.
We are excited to release this effort, dubbed hs2client, as a new Apache-licensed open source project on GitHub. As you may guess from the name, this library implements the HiveServer2 Thrift API as a C++ library, with careful handling of result sets so that languages like Python can access data at high performance.
The initial alpha preview release contains:
A C++ library, libhs2client, which provides a clean C++ API for the HiveServer2 Thrift API. It can be built and linked, dynamically or statically, into C/C++ applications with no direct exposure to Apache Thrift.
Python bindings, with optimized reads to pandas.DataFrame
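To give a feel for what "careful handling of result sets" can involve, here is a minimal pure-Python sketch. It is not code from hs2client and does not use its API. It assumes a columnar batch in which each column arrives as a list of values plus a null bitmask with one bit per row, least significant bit first, which is my understanding of how newer HiveServer2 protocol versions return data. The names is_null, decode_column, and decode_batch are hypothetical and exist only for this illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def is_null(null_bitmap: bytes, i: int) -> bool:
    """Return True if bit i is set in an LSB-first null bitmask (assumed layout)."""
    byte_index, bit_index = divmod(i, 8)
    # A bitmask shorter than the column is treated as "remaining rows are non-null".
    if byte_index >= len(null_bitmap):
        return False
    return bool((null_bitmap[byte_index] >> bit_index) & 1)


def decode_column(values: Sequence, null_bitmap: bytes) -> List[Optional[object]]:
    """Replace the values flagged in the bitmask with None."""
    return [None if is_null(null_bitmap, i) else v for i, v in enumerate(values)]


def decode_batch(
    names: Sequence[str],
    columns: Sequence[Tuple[Sequence, bytes]],
) -> Dict[str, List[Optional[object]]]:
    """Decode a hypothetical columnar batch into a dict of column name -> values."""
    return {
        name: decode_column(values, nulls)
        for name, (values, nulls) in zip(names, columns)
    }


if __name__ == "__main__":
    # Two columns, three rows; bit 1 of the second bitmask marks row 1 as null.
    batch = decode_batch(
        ["id", "score"],
        [([1, 2, 3], b"\x00"), ([0.5, 0.0, 2.5], b"\x02")],
    )
    print(batch)  # {'id': [1, 2, 3], 'score': [0.5, None, 2.5]}
```

A dict of equal-length columns like this can be passed directly to the pandas.DataFrame constructor. Doing the equivalent work in C++, one column at a time rather than one Python object per cell, is the kind of approach that makes a native library attractive for this job.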
To try out the library, you can install a dev build of the project right now with conda:
conda install hs2client -c cloudera/channel/dev