Introducing Cloudera Navigator

Cloudera Navigator is a fully integrated data management tool for the Hadoop platform. Cloudera Navigator provides data governance capabilities such as verifying access privileges and auditing access to all data stored in Hadoop. These capabilities are critical for enterprise customers that are in highly regulated industries and have stringent compliance requirements.

Cloudera Navigator tracks access permissions and actual accesses to all data objects in HDFS, Hive, HBase, and Cloudera Impala. This helps answer questions such as: who has access to which data objects, which data objects a user accessed, when a data object was accessed and by whom, which data assets were accessed using a service, and which device was used for the access. Cloudera Navigator supports tracking access to:

  • HDFS data accessed through the HDFS, Hive, HBase, and Cloudera Impala services
  • Hive metadata

Cloudera Navigator allows administrators to configure, collect, and view audit events, to understand who accessed what data and how. Cloudera Navigator also allows administrators to generate reports that list the HDFS access permissions granted to groups.
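
The questions listed earlier amount to filters over collected audit events. The following is a minimal sketch of that idea, assuming a hypothetical in-memory record shape: the AuditEvent fields, the helper functions, and the sample data are illustrative assumptions, not Cloudera Navigator's actual schema or API.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AuditEvent:
    # Hypothetical record shape; the field names are assumptions for illustration.
    timestamp: datetime
    username: str
    service: str      # e.g. "HDFS", "Hive", "HBase", "Impala"
    operation: str
    resource: str     # path, table, or other data object
    allowed: bool     # False when denied due to lack of privileges


def accessed_by(events, username):
    """Return the set of data objects that a given user accessed."""
    return {e.resource for e in events if e.username == username}


def accesses_to(events, resource):
    """Return (timestamp, username) pairs for one data object, oldest first."""
    return sorted((e.timestamp, e.username) for e in events if e.resource == resource)


def denied(events):
    """Return the events that were denied due to lack of privileges."""
    return [e for e in events if not e.allowed]


# Made-up sample data, used only to exercise the helpers above.
EVENTS = [
    AuditEvent(datetime(2013, 9, 1, 9, 0), "alice", "HDFS", "open", "/data/sales", True),
    AuditEvent(datetime(2013, 9, 1, 9, 5), "bob", "HDFS", "open", "/data/sales", False),
    AuditEvent(datetime(2013, 9, 2, 10, 0), "alice", "Hive", "QUERY", "default.orders", True),
]

if __name__ == "__main__":
    print(accessed_by(EVENTS, "alice"))
    print(accesses_to(EVENTS, "/data/sales"))
    print(denied(EVENTS))
```

In the product itself, these questions are answered through the audit views and reports in the Cloudera Manager Admin Console; the sketch only shows the kind of filtering involved.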

Cloudera Navigator Architecture

The architecture of Cloudera Navigator is illustrated below.

Cloudera Navigator is implemented as an add-on to Cloudera Manager 4.5 and later; all Cloudera Navigator functions (installation, configuration, and audit log review) are accessed through the Cloudera Manager Admin Console.

When Cloudera Navigator is installed, plug-ins that enable collection of audit events are added to the HDFS, HBase, and Hive (that is, the HiveServer2 and Beeswax servers) services. When data is accessed through an HDFS, HBase, or Hive service for which auditing is enabled, audit events are generated and sent to the Navigator Server, which stores the events securely and durably in a database. The mechanism for auditing Cloudera Impala access differs from that of the other services: Cloudera Impala records audit events in an audit log file. When Cloudera Navigator auditing is enabled, the Cloudera Manager Agent reads the log and transfers the events to the Navigator Server, which in turn transfers them to the Navigator database.
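
The two collection paths described above (plug-ins that send events directly, and a log file that an agent reads and forwards) can be sketched as a toy model. Everything here is an assumption made for illustration: the class and function names are hypothetical, a Python list stands in for the Navigator database, and the one-JSON-object-per-line log format is assumed rather than taken from the Impala documentation.

```python
import json


class NavigatorServer:
    """Stand-in for the Navigator Server; a list stands in for its database."""

    def __init__(self):
        self.database = []

    def receive(self, event):
        self.database.append(event)


class AuditPlugin:
    """Stand-in for a plug-in added to the HDFS, HBase, or Hive service."""

    def __init__(self, service, server):
        self.service = service
        self.server = server

    def on_access(self, username, operation, resource):
        # Path 1: the plug-in generates an event and sends it to the server.
        self.server.receive({
            "service": self.service,
            "username": username,
            "operation": operation,
            "resource": resource,
        })


def forward_impala_log(log_lines, server):
    """Stand-in for the Cloudera Manager Agent reading the Impala audit log.

    Path 2: events are first written to a log, then read and forwarded.
    Assumes one JSON object per line (an illustrative format). Returns the
    number of events forwarded.
    """
    forwarded = 0
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        event = json.loads(line)
        event["service"] = "Impala"
        server.receive(event)
        forwarded += 1
    return forwarded


if __name__ == "__main__":
    server = NavigatorServer()
    AuditPlugin("HDFS", server).on_access("alice", "open", "/data/sales")
    log = ['{"username": "bob", "operation": "QUERY", "resource": "default.orders"}']
    forward_impala_log(log, server)
    for stored in server.database:
        print(stored)
```

Both paths end at the same store, which is why audit events from all services can be reviewed together.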

Service Versions and Audited Operations

This section describes the service versions and audited operations supported by Cloudera Navigator.

HDFS

Minimum supported version: CDH 4.0

The captured operations are:

  • Operations that access or modify a file's or directory's data or metadata
  • Operations denied due to lack of privileges

HBase

Minimum supported version: CDH 4.0

The captured operations are:

  • Operations that require a privilege (except balance, balance switch, and append)
  • Operations denied due to lack of privileges
  Note:
  • In CDH versions earlier than 4.2, the operation recorded in log events for grant and revoke operations is ADMIN.
  • In simple authentication mode, if the HBase Secure RPC Engine property is false (the default), the username in log events is UNKNOWN. To see a meaningful user name:
    1. Click the HBase service.
    2. Select Configuration > View and Edit > Service-wide > Security.
    3. Set the HBase Secure RPC Engine property to true.
    4. Save the change and restart the service.
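
The username rule in the note above can be restated as a small decision function. This is a sketch only: the function and parameter names are hypothetical, and the assumption that the real username is logged in every case other than the one the note describes is an inference from the note, not a documented guarantee.

```python
def hbase_logged_username(actual_user, simple_auth, secure_rpc_engine=False):
    """Return the username expected in HBase audit log events.

    actual_user:        the user who performed the operation
    simple_auth:        True when the cluster uses simple authentication mode
    secure_rpc_engine:  value of the HBase Secure RPC Engine property
                        (false by default)
    """
    if simple_auth and not secure_rpc_engine:
        # The case called out in the note: the user cannot be identified.
        return "UNKNOWN"
    # Assumption: in all other cases the real username is recorded.
    return actual_user


if __name__ == "__main__":
    print(hbase_logged_username("alice", simple_auth=True))
    print(hbase_logged_username("alice", simple_auth=True, secure_rpc_engine=True))
```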

Hive

Minimum supported versions: CDH 4.2; CDH 4.4 for operations denied due to lack of privileges

The captured operations are:

  • Operations (except grant, revoke, and metadata access only) sent to HiveServer2
  • Operations denied due to lack of privileges
  Note:
  • Access via the Hive CLI is not supported
  • In simple authentication mode, the username in log events is the username passed in the HiveServer2 connect command. If you do not pass a username in the connect command, the username in log events is anonymous.

Hue

Minimum supported version: CDH 4.2

The captured operations are:

  • Operations (except grant, revoke, and metadata access only) sent to Beeswax Server
  Note: You do not directly configure the Hue service for auditing. Instead, when you configure the Hive service for auditing, operations sent to the Hive service through Beeswax appear in the Hue service audit log.

Cloudera Impala

Minimum supported versions: CDH 4.4 and Cloudera Impala 1.1.1.

The captured operations are:

  • Queries denied due to lack of privileges
  • Queries that pass analysis
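
The minimum CDH versions listed in this section can be collected into a lookup table for a quick compatibility check. The following is a sketch under stated assumptions: the table and function names are hypothetical, version strings are assumed to be dot-separated integers such as "4.2" or "4.2.1", and the check covers only the CDH version (Cloudera Impala additionally requires Impala 1.1.1, which is not modeled here).

```python
# Minimum CDH version per audited service, as listed in this section.
MIN_CDH_VERSION = {
    "HDFS": (4, 0),
    "HBase": (4, 0),
    "Hive": (4, 2),
    "Hue": (4, 2),
    "Impala": (4, 4),  # also requires Cloudera Impala 1.1.1 (not checked here)
}

# Hive needs a later release to capture operations denied for lack of privileges.
HIVE_DENIED_OPS_MIN_CDH = (4, 4)


def parse_version(text):
    """Parse a dot-separated version such as "4.2.1" into a tuple of ints."""
    parts = [int(p) for p in text.split(".")]
    while len(parts) < 2:
        parts.append(0)  # treat "4" as "4.0" so tuple comparison behaves
    return tuple(parts)


def auditing_supported(service, cdh_version):
    """Return True if the CDH version meets the service's listed minimum."""
    return parse_version(cdh_version) >= MIN_CDH_VERSION[service]


def hive_denied_ops_captured(cdh_version):
    """Return True if denied Hive operations are captured on this CDH version."""
    return parse_version(cdh_version) >= HIVE_DENIED_OPS_MIN_CDH


if __name__ == "__main__":
    for service in MIN_CDH_VERSION:
        print(service, auditing_supported(service, "4.2"))
```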