Cloudera Impala: A Modern SQL Engine for Hadoop

Date: Thursday, Jan 10 2013

Description

This is a technical deep dive into Cloudera Impala, the project that makes scalable parallel database technology available to the Hadoop community for the first time. Impala is an open source code base that lets users issue low-latency queries against data stored in HDFS and Apache HBase using familiar SQL operators.

Presenter Marcel Kornacker, creator of Impala, begins with an overview of Impala from the user's perspective, continues with an overview of Impala's architecture and implementation, and concludes with a comparison of Impala with Dremel and Apache Hive, commercial MapReduce alternatives, and traditional data warehouse infrastructure.

Next Steps