HBaseCon 2012 | Living Data: Applying Adaptable Schemas to HBase

A schema management methodology based on Apache Avro that enables users and applications to share data in HBase in a scalable, evolvable fashion.

Date: Monday, Jun 18 2012

Description

In this talk, we describe a schema management methodology based on Apache Avro that enables users and applications to share data in HBase in a scalable, evolvable fashion. By adopting these practices, engineers independently using the same data have guarantees on how their applications interact. As data collection needs change, applications are resilient to drift in the underlying data representation. This methodology results in a data dictionary that allows less-technical users to understand what data is available to them for analysis and inspect data using general-purpose tools (for example, export it via Sqoop to an RDBMS). And because of Avro’s cross-language capabilities, HBase’s power can reach new domains, like web apps built in Ruby.
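
To make the idea of resilience to schema drift concrete, here is a minimal, library-free sketch of the kind of record resolution Avro performs between a writer's schema and a reader's schema. It is not taken from the talk and does not use the Avro library: the `User` schemas, the field names, and the `resolve_record` helper are all hypothetical. It covers only top-level record fields, matching them by name, filling in reader defaults, and ignoring writer-only fields.

```python
# Illustrative only: a simplified imitation of Avro-style record resolution,
# written with plain dicts. The schemas and helper name are assumptions.

# Schema the data was originally written with (hypothetical).
WRITER_SCHEMA_V1 = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "email", "type": "string"},
    ],
}

# Schema a newer application reads with (hypothetical): it no longer
# needs "email" and expects a new "country" field with a default.
READER_SCHEMA_V2 = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "country", "type": "string", "default": "unknown"},
    ],
}


def resolve_record(datum, writer_schema, reader_schema):
    """Project a datum written with writer_schema onto reader_schema."""
    writer_fields = {f["name"] for f in writer_schema["fields"]}
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in writer_fields:
            # Field known to both sides: take the stored value.
            resolved[name] = datum[name]
        elif "default" in field:
            # Field added after the data was written: use the reader default.
            resolved[name] = field["default"]
        else:
            # No stored value and no default: the schemas are incompatible.
            raise ValueError(f"no value or default for field {name!r}")
    # Writer-only fields (here, "email") are simply dropped.
    return resolved


old_row = {"id": 42, "email": "a@example.com"}
new_view = resolve_record(old_row, WRITER_SCHEMA_V1, READER_SCHEMA_V2)
print(new_view)
```

In this sketch, a cell written under the older schema remains readable by the newer application without rewriting the stored data, which is the property the abstract refers to as resilience to drift.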