This is the documentation for CDH 4.6.0.
Documentation for other versions is available at Cloudera Documentation.

Hadoop Users in CDH

A number of special users are created by default when you install and use CDH. The following table lists these users as of the latest CDH 4 release.

Project

User

Group

Comment

Apache Flume

flume flume

When a Flume sink writes to HDFS as this user, the user must have write privileges on the target location.

Apache HBase

hbase hbase

The Master and the RegionServer processes run as this user.

HDFS

hdfs hdfs

The NameNode and DataNodes run as this user, and the HDFS root directory as well as the directories used for edit logs should be owned by it.

Superusers are defined by the dfs.permissions.superusergroup property in hdfs-site.xml, which names the UNIX group whose members HDFS treats as superusers. You can keep the default value of hadoop or choose your own. The hdfs, yarn, and mapred users belong to the hadoop group. To give users superuser privileges in HDFS, create a UNIX group with the same name as the value of this property (or change the property to match an existing UNIX group) and add the users to that group. The impala user also belongs to the hdfs group.

Apache Hive

hive hive

The HiveServer2 process and the Hive Metastore processes run as this user.

Hive also needs a user for access to its Metastore database (for example, MySQL or PostgreSQL). This can be any identifier and does not correspond to a UNIX uid. It is set with javax.jdo.option.ConnectionUserName in hive-site.xml.

Apache HCatalog

hive hive

The WebHCat service (for REST access to Hive functionality) runs as the hive user. The user is not configurable.

HttpFS

httpfs httpfs

The HttpFS service runs as this user.

Hue

hue hue

Hue runs as this user. The user is not configurable.

Cloudera Impala

impala impala

Impala is an interactive query tool. The impala user also belongs to the hive and hdfs groups.

Apache Mahout

 

No special users

MapReduce

mapred mapred

Without Kerberos, the JobTracker and tasks run as this user. With Kerberos, the LinuxTaskController binary is owned by this user. Using a different user ID would be complicated.

Apache Oozie

oozie oozie

The Oozie service runs as this user.

Apache Pig

 

No special users

Cloudera Search

solr solr

The Solr process runs as this user. The user is not configurable.

Apache Sentry (incubating)

 

No special users

Apache Sqoop

sqoop sqoop

This user is only for the Sqoop1 Metastore, a configuration option that is not recommended.

Apache Sqoop2

sqoop2 sqoop

The Sqoop2 service runs as this user.

Apache Whirr

 

No special users

YARN

yarn mapred, yarn

Without Kerberos, all YARN services and applications run as this user. With Kerberos, the LinuxContainerExecutor binary is owned by this user. Using a different user ID would be complicated.

Apache ZooKeeper

zookeeper zookeeper

The ZooKeeper process runs as this user. The user is not configurable.
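The HDFS entry in the table ties superuser status to the group named by dfs.permissions.superusergroup in hdfs-site.xml. As a minimal sketch of how that lookup could be checked, the Python below parses a Hadoop-style configuration document and lists the members of the configured group. The sample XML, the sample group database, and the helper names get_property and superuser_group_members are illustrative assumptions, not part of CDH. The fallback value of hadoop is taken from the text above.

```python
import xml.etree.ElementTree as ET

# Illustrative hdfs-site.xml content (an assumption for this sketch).
SAMPLE_HDFS_SITE = """<configuration>
  <property>
    <name>dfs.permissions.superusergroup</name>
    <value>hadoop</value>
  </property>
</configuration>
"""

# Hypothetical group database mirroring the table: hdfs, yarn, and mapred
# belong to the hadoop group. On a real host this mapping could be
# populated from the standard grp module instead of being hard-coded.
SAMPLE_GROUPS = {
    "hadoop": ["hdfs", "yarn", "mapred"],
    "hive": ["hive", "impala"],
}


def get_property(xml_text, name, default=None):
    """Return the value of a Hadoop-style <property> entry, or default if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else default
    return default


def superuser_group_members(xml_text, group_members):
    """List the members of the group named by dfs.permissions.superusergroup."""
    # Fall back to "hadoop", the default value stated in the table above.
    group = get_property(xml_text, "dfs.permissions.superusergroup", "hadoop")
    return sorted(group_members.get(group, []))


if __name__ == "__main__":
    print(superuser_group_members(SAMPLE_HDFS_SITE, SAMPLE_GROUPS))
    # prints ['hdfs', 'mapred', 'yarn']
```

The same get_property helper could read javax.jdo.option.ConnectionUserName from the contents of hive-site.xml, since both files use the same property layout. Passing the group database in as a dictionary keeps the sketch independent of the host it runs on.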