Prashant Mishra


This disclosure describes high-performance techniques for using a client library to access data from OLAP or OLTP databases. Data is transported between the database and the client library in a columnar in-memory format such as Apache Arrow, over a binary protocol such as RPC. Because it uses Protocol Buffers and HTTP/2, RPC is significantly faster than REST when receiving data. In addition, Apache Arrow supports zero-copy reads and efficient in-memory computation. The incoming stream of data is processed in separate parser threads, which maximizes concurrency and parallel processing. Together, these choices make the client library an order of magnitude faster than legacy REST-based implementations. The client library deserializes the columnar in-memory data into rows so that callers can retrieve data in a familiar, backward-compatible format, e.g., java.sql.ResultSet (if the client library is written in Java), which acts as a data abstraction layer.
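
The row-oriented view over columnar data described above can be illustrated with a minimal sketch. It is not the disclosure's implementation: plain Java arrays stand in for Arrow vectors, the network transport is omitted, and the names ColumnBatch, Row, RowCursor, and open are hypothetical. A parser thread pivots each columnar batch into rows and hands them through a bounded queue to a forward-only cursor whose next()/getXxx() shape loosely mirrors java.sql.ResultSet.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Illustrative sketch only: plain arrays stand in for Arrow vectors. */
final class ColumnarRowDemo {

    /** Hypothetical stand-in for one columnar batch: two columns of equal length. */
    static final class ColumnBatch {
        final long[] ids;
        final String[] names;
        ColumnBatch(long[] ids, String[] names) { this.ids = ids; this.names = names; }
    }

    /** One deserialized row. */
    static final class Row {
        final long id;
        final String name;
        Row(long id, String name) { this.id = id; this.name = name; }
    }

    /** Sentinel object (compared by reference) that marks the end of the stream. */
    private static final Row END_OF_STREAM = new Row(-1L, null);

    /** Forward-only cursor; a simplification, not a java.sql.ResultSet implementation. */
    static final class RowCursor {
        private final BlockingQueue<Row> queue;
        private Row current;
        private boolean done;

        RowCursor(BlockingQueue<Row> queue) { this.queue = queue; }

        /** Blocks until the parser thread supplies a row; returns false at end of stream. */
        boolean next() {
            if (done) return false;
            try {
                Row row = queue.take();
                if (row == END_OF_STREAM) { done = true; current = null; return false; }
                current = row;
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for rows", e);
            }
        }

        long getLong() { return current.id; }       // value of the first column
        String getString() { return current.name; } // value of the second column
    }

    /** Starts a parser thread that pivots each batch into rows and feeds a bounded queue. */
    static RowCursor open(final List<ColumnBatch> batches) {
        final BlockingQueue<Row> queue = new ArrayBlockingQueue<>(1024);
        Thread parser = new Thread(() -> {
            try {
                for (ColumnBatch batch : batches) {
                    for (int i = 0; i < batch.ids.length; i++) {
                        queue.put(new Row(batch.ids[i], batch.names[i])); // column-to-row pivot
                    }
                }
                queue.put(END_OF_STREAM);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop parsing if interrupted
            }
        }, "row-parser");
        parser.setDaemon(true);
        parser.start();
        return new RowCursor(queue);
    }

    public static void main(String[] args) {
        List<ColumnBatch> batches = Arrays.asList(
            new ColumnBatch(new long[] {1L, 2L}, new String[] {"alice", "bob"}),
            new ColumnBatch(new long[] {3L}, new String[] {"carol"}));
        RowCursor cursor = open(batches);
        while (cursor.next()) {
            System.out.println(cursor.getLong() + " " + cursor.getString());
        }
    }
}
```

The bounded queue applies back-pressure: the parser thread blocks when the caller falls behind, so memory use stays capped while parsing and consumption overlap. A fuller version would also need to propagate parser failures to the caller, which this sketch leaves out.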

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.