This is the documentation for CDH 4.7.0.
Documentation for other versions is available at Cloudera Documentation.

Hue Configuration

This section describes configuration you perform in the Hue configuration file hue.ini. The location of the Hue configuration file varies depending on how Hue is installed. The location of the configuration file is displayed when you view the Hue configuration.

  Note: Only the root user can edit the Hue configuration file.

Viewing the Hue Configuration

  Note: You must be a Hue superuser to view the Hue configuration.

When Hue detects an invalid configuration, it displays an icon in the top navigation bar. In addition, the Check Configuration page of the Quick Start wizard opens with information about the misconfigured properties.

To view the Hue configuration, do one of the following:

  • Click and click the Configuration tab.
  • Visit http://myserver:port/dump_config.

You can configure the Hue apps using the properties described in the following sections.

Hue Server Configuration

This section describes Hue Server settings.

Specifying the Hue Server HTTP Address

These configuration properties are under the [desktop] section in the Hue configuration file.

Hue includes two web servers, the CherryPy web server and the Spawning web server (configurable). You can use the following options to change the IP address and port that the web server listens on. The default setting is port 8888 on all configured IP addresses.
# Webserver listens on this address and port 
http_host=0.0.0.0 
http_port=8888
Hue defaults to using the Spawning web server, which is necessary for the Shell application. To revert to the CherryPy web server, use the following setting in the Hue configuration file:
use_cherrypy_server=true

Setting this to false causes Hue to use the Spawning web server.

Specifying the Secret Key

For security, you should specify the secret key that is used for secure hashing in the session store:

  1. Open the Hue configuration file.
  2. In the [desktop] section, set the secret_key property to a long series of random characters (30 to 60 characters is recommended). For example,
    secret_key=qpbdxoewsqlkhztybvfidtvwekftusgdlofbcfghaswuicmqp
      Note: If you don't specify a secret key, your session cookies will not be secure. Hue will run but it will also display error messages telling you to set the secret key.

Authentication

By default, the first user who logs in to Hue can choose any username and password and automatically becomes an administrator. This user can create other user and administrator accounts. Hue users should correspond to the Linux users who will use Hue; make sure you use the same name as the Linux username.

By default, user information is stored in the Hue database. However, the authentication system is pluggable. You can configure authentication to use an LDAP directory (Active Directory or OpenLDAP) to perform the authentication, or you can import users and groups from an LDAP directory. See Configuring an LDAP Server for User Admin.

For more information, see Hue SDK.

Configuring the Hue Server for SSL

You can optionally configure Hue to serve over HTTPS. pyOpenSSL is now part of the Hue build and does not need to be installed manually. To configure SSL, perform the following steps from the root of your Hue installation path:

  1. Configure Hue to use your private key by adding the following options to the Hue configuration file:
    ssl_certificate=/path/to/certificate
    ssl_private_key=/path/to/key
  2. On a production system, you should have an appropriate key signed by a well-known Certificate Authority. If you're just testing, you can create a self-signed key using the openssl command that may be installed on your system:
    # Create a key 
    $ openssl genrsa 1024 > host.key 
    # Create a self-signed certificate 
    $ openssl req -new -x509 -nodes -sha1 -key host.key > host.cert
      Note: Uploading files using the Hue File Browser over HTTPS requires using a proper SSL Certificate. Self-signed certificates don't work.

Beeswax Configuration

In the [beeswax] section of the configuration file, you can optionally specify the following:

beeswax_server_host

The hostname or IP address of the Hive server.

Default: localhost, and therefore only serves local IPC clients.

beeswax_server_port

The port of the Hive server.

If server_interface is set to hiveserver2, this should be set to the port that HiveServer2 is running on, which defaults to 10000.

Default: 8002.

hive_home_dir

The base directory of the Hive installation.

hive_conf_dir

The directory containing the hive-site.xml Hive configuration file.

beeswax_server_heapsize

The heap size (-Xmx) of the Beeswax Server.

server_interface

The type of the Hive server that the application uses:
  • beeswax
  • hiveserver2
Default: beeswax.

By default, Beeswax allows any user to see the saved queries of all other Beeswax users. You can restrict this by changing the following property:

share_saved_queries

Set to false to restrict viewing of saved queries to the owner of the query or an administrator.
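
As an illustration, the properties above could be combined as follows to point Beeswax at a HiveServer2 instance and keep saved queries private. This is a sketch, not a recommended configuration: the hostname hive.example.com is a placeholder, and the port is the HiveServer2 default mentioned above.

```ini
[beeswax]
# Placeholder hostname (assumption); replace with your Hive server host
beeswax_server_host=hive.example.com
# Port that HiveServer2 listens on (10000 by default)
beeswax_server_port=10000
# Use HiveServer2 instead of the default beeswax interface
server_interface=hiveserver2
# Show saved queries only to their owner and to administrators
share_saved_queries=false
```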

Cloudera Impala Query UI Configuration

In the [impala] section of the configuration file, you can optionally specify the following:

server_host

The hostname or IP address of the Impala Server.

Default: localhost.

server_port

The port of the Impalad Server.

Default: When using the beeswax interface, 21000. When using the HiveServer2 interface, 21050.

server_interface

The type of interface to use to communicate with Impalad Server:
  • beeswax
  • hiveserver2
Default: hiveserver2.
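
For example, an [impala] section that uses the HiveServer2 interface might look like the following sketch. The hostname impalad.example.com is a placeholder; the port is the HiveServer2-interface default listed above.

```ini
[impala]
# Placeholder hostname (assumption); replace with a host running impalad
server_host=impalad.example.com
# 21050 for the hiveserver2 interface, 21000 for the beeswax interface
server_port=21050
server_interface=hiveserver2
```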

Pig Editor Configuration

In the [pig] section of the configuration file, you can optionally specify the following:

remote_data_dir

Location on HDFS where the Pig examples are stored.

Sqoop Configuration

In the [sqoop] section of the configuration file, you can optionally specify the following:

server_url

The URL of the Sqoop 2 server.

Job Browser Configuration

By default, any user can see submitted job information for all users. You can restrict viewing of submitted job information by optionally setting the following property under the [jobbrowser] section in the Hue configuration file:

share_jobs

Indicate that jobs should be shared with all users. If set to false, they will be visible only to the owner and administrators.
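
A minimal sketch of a [jobbrowser] section that restricts job visibility:

```ini
[jobbrowser]
# Jobs are visible only to their owner and to administrators
share_jobs=false
```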

Job Designer

In the [jobsub] section of the configuration file, you can optionally specify the following:

remote_data_dir

Location in HDFS where the Job Designer examples and templates are stored.

Oozie Editor/Dashboard Configuration

By default, any user can see all workflows, coordinators, and bundles. You can restrict viewing of workflows, coordinators, and bundles by optionally specifying the following property under the [oozie] section of the Hue configuration file:

share_jobs

Indicate that workflows, coordinators, and bundles should be shared with all users. If set to false, they will be visible only to the owner and administrators.

oozie_jobs_count

Maximum number of Oozie workflows or coordinators or bundles to retrieve in one API call.

remote_data_dir

The location in HDFS where Oozie workflows are stored.

Also see Liboozie Configuration
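
The [oozie] properties above could be set as in the following sketch. The values for oozie_jobs_count and remote_data_dir are illustrative assumptions, not documented defaults.

```ini
[oozie]
# Workflows, coordinators, and bundles are visible only to their
# owner and to administrators
share_jobs=false
# Illustrative value (assumption): maximum jobs retrieved per API call
oozie_jobs_count=100
# Illustrative HDFS path (assumption)
remote_data_dir=/user/hue/oozie/workspaces
```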

Search Configuration

In the [search] section of the configuration file, you can optionally specify the following:

security_enabled

Indicate whether Solr requires clients to perform Kerberos authentication.

empty_query

Query sent when no term is entered.

Default: *:*.

solr_url

URL of the Solr server.
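
As a sketch, a [search] section for a Solr server without Kerberos might look like this. The solr_url value is a placeholder that assumes Solr is running locally on its usual port, 8983.

```ini
[search]
# Set to true if Solr requires Kerberos authentication
security_enabled=false
# Query sent when the user enters no search term
empty_query=*:*
# Placeholder URL (assumption); replace with your Solr server
solr_url=http://localhost:8983/solr/
```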

Hue Shell Configuration

Hue includes the Shell application, which provides access to the Pig, HBase, and Sqoop 2 command-line shells. The Shell application is designed to have the same look and feel as a Unix terminal. In addition to the shells configured by default, it is possible to include almost any process that exposes a command-line interface as an option in this Hue application.

  Note: Flume 1.x does not provide a shell. However if you have Flume 0.9.x installed, you can access its shell through the Hue Shell application.
  Note: Pig Shell will not run unless you configure the JAVA_HOME environment variable in the Hue configuration file. For instructions, see Hue Shell Configuration Properties.

Verifying Command-Line Shell Installations

To work properly, Hue Shell requires the command-line shells to be installed and available on the system. You must specify absolute paths, so you should test exactly the commands you will enter in the configuration file in a terminal. For example:

To verify that the Pig Shell (Grunt) is installed:

$ /usr/bin/pig 

To verify that the HBase Shell is installed:

$ /usr/bin/hbase shell

To verify that the Sqoop 2 Shell is installed:

$ /usr/bin/sqoop2 

Hue Shell Configuration Properties

To add or remove shells, modify the Hue Shell configuration in the [shell] section of the Hue configuration file:

shell_buffer_amount

Optional. Amount of output to buffer for each shell in bytes. Defaults to 524288 (512 KiB) if not specified.

shell_timeout

Optional. Amount of time to keep shell subprocesses open when no open browsers refer to them. Defaults to 600 seconds if not specified.

shell_write_buffer_limit

Optional. Amount of pending commands to buffer for each shell, in bytes. Defaults to 10000 (10 KB) if not specified.

shell_os_read_amount

Optional. Number of bytes to specify to the read system call when reading subprocess output. Defaults to 40960 (usually 10 pages, since pages are usually 4096 bytes) if not specified.

shell_delegation_token_dir

Optional. If this instance of Hue is running with a Hadoop cluster with Kerberos security enabled, it must acquire the appropriate delegation tokens to execute subprocesses securely. The value under this key specifies the directory in which these delegation tokens are to be stored. Defaults to /tmp/hue_shell_delegation_tokens if not specified.

[[ shelltypes ]]

This subsection groups the individual shell configurations.

[[[ pig ]]]

This section title is a key name that also begins the configuration parameters for a specific shell type ("pig" in this example). You can use any name, but it must be unique for each shell specified in the configuration file. Each key name denotes the beginning of a shell configuration section; each section can contain the following parameters: nice_name, command, help, environment, and the value for each environment variable.

nice_name = "Pig Shell (Grunt)"

The user-facing name.

command = "/usr/bin/pig -l /dev/null"

The command to run to start the specified shell. The path to the binary must be an absolute path.

help = "A platform for data exploration."

Optional. A string that describes the shell.

[[[[ environment ]]]]

Optional. A section to specify environment variables to be set for subprocesses of this shell type. Each environment variable is itself another sub-section, as described below.

[[[[[ JAVA_HOME ]]]]]

The name of the environment variable to set. For example, Pig requires JAVA_HOME to be set.

value = /usr/lib/jvm/java-6-sun

The value for the environment variable.
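
Put together, the properties described above nest as shown in the following sketch of a [shell] section. The values are the defaults and examples given in this section; the indentation is only for readability.

```ini
[shell]
# Optional tuning properties, shown with their default values
shell_buffer_amount=524288
shell_timeout=600
shell_write_buffer_limit=10000
shell_os_read_amount=40960
shell_delegation_token_dir=/tmp/hue_shell_delegation_tokens

[[ shelltypes ]]

  # One sub-section per shell; the key name must be unique
  [[[ pig ]]]
  nice_name = "Pig Shell (Grunt)"
  command = "/usr/bin/pig -l /dev/null"
  help = "A platform for data exploration."

    [[[[ environment ]]]]

      # One sub-section per environment variable
      [[[[[ JAVA_HOME ]]]]]
      value = /usr/lib/jvm/java-6-sun
```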

Restrictions

While almost any process that exports a command-line interface can be included in the Shell application, processes that redraw the window, such as vim or top, cannot be exposed in this way.

Unix User Accounts

To properly isolate subprocesses so as to guarantee security, each Hue user who is using the Shell subprocess must have a Unix user account. The link between Hue users and Unix user accounts is the username, and so every Hue user who wants to use the Shell application must have a Unix user account with the same name on the server that runs Hue.

Also, there is a binary called setuid which provides a binary wrapper, allowing for the subprocess to be run as the appropriate user. In order to work properly for all users of Hue, this binary must be owned by root and must have the setuid bit set.

To make sure that these two requirements are satisfied, navigate to the directory with the setuid binary (apps/shell/src/shell/build) and execute one of the following commands in a terminal:

Using sudo:

$ sudo chown root:hue setuid
$ sudo chmod 4750 setuid
$ exit

As root:

$ su
# chown root:hue setuid
# chmod 4750 setuid
# exit
  Important: If you are running Hue Shell against a secure cluster, see Running Hue Shell against a Secure Cluster for security configuration information for Hue Shell.

Hue Server Configuration

Older versions of Hue shipped with the CherryPy web server as the default Hue server. This is no longer the case starting with CDH3 Update 1. To change the Hue server, edit the value of use_cherrypy_server in the Hue configuration file. This value must either be set to false or not specified in order for the Shell application to work.

HBase Configuration

In the [hbase] section of the configuration file, you can optionally specify the following:

truncate_limit

Hard limit of rows or columns per row fetched before truncating.

Default: 500

hbase_clusters

Comma-separated list of HBase Thrift servers for clusters in the format of "(name|host:port)".

Default: (Cluster|localhost:9090)
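
For reference, an [hbase] section that spells out the default values listed above would look like this sketch:

```ini
[hbase]
# Rows (or columns per row) fetched before results are truncated
truncate_limit=500
# Comma-separated list of (name|host:port) entries for HBase Thrift servers
hbase_clusters=(Cluster|localhost:9090)
```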

User Admin Configuration

In the [useradmin] section of the configuration file, you can optionally specify the following:

default_user_group

The name of the group to which a manually created user is automatically assigned.

Default: default.

Configuring an LDAP Server for User Admin

User Admin can interact with an LDAP server, such as Active Directory, in one of two ways:

  • You can import user and group information from your current Active Directory infrastructure using the LDAP Import feature in the User Admin application. User authentication is then performed by User Admin based on the imported user and password information. You can then manage the imported users, along with any users you create directly in User Admin. See Enabling Import of Users and Groups from an LDAP Directory.
  • You can configure User Admin to use an LDAP server as the authentication back end, which means users logging in to Hue will authenticate to the LDAP server, rather than against a username and password kept in User Admin. In this scenario, your users must all reside in the LDAP directory. See Enabling the LDAP Server for User Authentication for further information.

Enabling Import of Users and Groups from an LDAP Directory

User Admin can import users and groups from an Active Directory via the Lightweight Directory Access Protocol (LDAP). In order to use this feature, you must configure User Admin with a set of LDAP settings in the Hue configuration file.

  Note: If you import users from LDAP, you must set passwords for them manually; password information is not imported.

To enable LDAP import of users and groups:

  1. In the Hue configuration file, configure the following properties in the [[ldap]] section:

    base_dn

    The search base for finding users and groups.

    Example: base_dn="DC=mycompany,DC=com"

    nt_domain

    The NT domain to connect to (only for use with Active Directory).

    Example: nt_domain=mycompany.com

    ldap_url

    URL of the LDAP server.

    Example: ldap_url=ldap://auth.mycompany.com

    ldap_cert

    Path to certificate for authentication over TLS (optional).

    Example: ldap_cert=/mycertsdir/myTLScert

    bind_dn

    Distinguished name of the user to bind as; not necessary if the LDAP server supports anonymous searches.

    Example: bind_dn="CN=ServiceAccount,DC=mycompany,DC=com"

    bind_password

    Password of the bind user; not necessary if the LDAP server supports anonymous searches.

    Example: bind_password=P@ssw0rd

  2. Configure the following properties in the [[[users]]] section:

    user_filter

    Base filter for searching for users.

    Example: user_filter="objectclass=*"

    user_name_attr

    The username attribute in the LDAP schema.

    Example: user_name_attr=sAMAccountName

  3. Configure the following properties in the [[[groups]]] section:

    group_filter

    Base filter for searching for groups.

    Example: group_filter="objectclass=*"

    group_name_attr

    The group name attribute in the LDAP schema.

    Example: group_name_attr=cn

  Note: If you provide a TLS certificate, it must be signed by a Certificate Authority that is trusted by the LDAP server.
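
Taken together, the three steps above produce a nested configuration like the following sketch. The values are the examples used in this section. Placing [[ldap]] under [desktop] is an assumption based on the usual layout of hue.ini; this section does not state the parent section explicitly.

```ini
[desktop]
  [[ldap]]
    base_dn="DC=mycompany,DC=com"
    nt_domain=mycompany.com
    ldap_url=ldap://auth.mycompany.com
    # Optional: certificate for authentication over TLS
    ldap_cert=/mycertsdir/myTLScert
    # Not needed if the LDAP server supports anonymous searches
    bind_dn="CN=ServiceAccount,DC=mycompany,DC=com"
    bind_password=P@ssw0rd

    [[[users]]]
      user_filter="objectclass=*"
      user_name_attr=sAMAccountName

    [[[groups]]]
      group_filter="objectclass=*"
      group_name_attr=cn
```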

Enabling the LDAP Server for User Authentication

You can configure User Admin to use an LDAP server as the authentication back end, which means users logging in to Hue will authenticate to the LDAP server, rather than against usernames and passwords managed by User Admin.

  Important: Be aware that when you enable the LDAP back end for user authentication, user authentication by User Admin will be disabled. This means there will be no superuser accounts to log into Hue unless you take one of the following actions:
  • Import one or more superuser accounts from Active Directory and assign them superuser permission.
  • If you have already enabled the LDAP authentication back end, log into Hue using the LDAP back end, which will create an LDAP user. Then disable the LDAP authentication back end and use User Admin to give the superuser permission to the new LDAP user.
After assigning the superuser permission, enable the LDAP authentication back end.

To enable the LDAP server for user authentication:

  1. In the Hue configuration file, configure the following properties in the [[ldap]] section:

    ldap_url

    URL of the LDAP server, prefixed by ldap:// or ldaps://.

    Example: ldap_url=ldap://auth.mycompany.com

    search_bind_authentication

    Search bind authentication is disabled by default. To enable search bind authentication, set this property to true. When using search bind, Hue ignores the nt_domain and ldap_username_pattern properties described below and discovers Distinguished Names with an LDAP search using the user_name_attr property.

    Example: search_bind_authentication=true

    nt_domain

    The NT domain over which the user connects (not strictly necessary if using ldap_username_pattern).

    Example: nt_domain=mycompany.com

    ldap_username_pattern

    Pattern for searching for usernames. Use <username> for the username parameter. For use when using LdapBackend for Hue authentication.

    Example: ldap_username_pattern="uid=<username>,ou=People,dc=mycompany,dc=com"
  2. If you are using TLS or secure ports, add the following property to specify the path to a TLS certificate file:

    ldap_cert

    Path to certificate for authentication over TLS.

    Example: ldap_cert=/mycertsdir/myTLScert

      Note: If you provide a TLS certificate, it must be signed by a Certificate Authority that is trusted by the LDAP server.
  3. In the [[auth]] sub-section inside [desktop], change the following:

    backend

    Change the setting of backend from
    backend=desktop.auth.backend.AllowFirstUserDjangoBackend
    to
    backend=desktop.auth.backend.LdapBackend
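
The result of these steps can be sketched as follows, using the example values from this section. Placing [[ldap]] under [desktop] is an assumption based on the usual layout of hue.ini. This sketch uses ldap_username_pattern rather than search bind or nt_domain; adjust it to the method your directory requires.

```ini
[desktop]
  [[auth]]
    # Authenticate against LDAP instead of the Hue database
    backend=desktop.auth.backend.LdapBackend

  [[ldap]]
    ldap_url=ldap://auth.mycompany.com
    # <username> is replaced with the name entered at login
    ldap_username_pattern="uid=<username>,ou=People,dc=mycompany,dc=com"
    # Only needed when using TLS or secure ports
    ldap_cert=/mycertsdir/myTLScert
```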

Hadoop Configuration

The following configuration variables are under the [hadoop] section in the Hue configuration file.

HDFS Cluster Configuration

Hue currently supports only one HDFS cluster, which you define under the [[hdfs_clusters]] sub-section. The following properties are supported:

[[[default]]]

The section containing the default settings.

fs_defaultfs

The equivalent of fs.defaultFS (also referred to as fs.default.name) in a Hadoop configuration.

webhdfs_url

The HttpFS URL. The default value is the HTTP port on the NameNode.

hadoop_hdfs_home

The home of your Hadoop HDFS installation. It is the root of the Hadoop untarred directory, or usually /usr/lib/hadoop-hdfs.

hadoop_bin

The HDFS Hadoop launcher script, which is usually /usr/bin/hadoop.

hadoop_conf_dir

The configuration directory of HDFS, typically /etc/hadoop/conf.
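
As a sketch, an HDFS cluster definition using the properties above might look like this. The hostname namenode.example.com and the ports 8020 and 50070 are assumptions for illustration; use the values from your own Hadoop configuration.

```ini
[hadoop]
  [[hdfs_clusters]]
    [[[default]]]
      # Should match fs.defaultFS in the Hadoop configuration;
      # hostname and port are placeholders (assumption)
      fs_defaultfs=hdfs://namenode.example.com:8020
      # Placeholder URL (assumption) for the NameNode HTTP port
      webhdfs_url=http://namenode.example.com:50070/webhdfs/v1/
      hadoop_hdfs_home=/usr/lib/hadoop-hdfs
      hadoop_bin=/usr/bin/hadoop
      hadoop_conf_dir=/etc/hadoop/conf
```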

MapReduce (MRv1) and YARN (MRv2) Cluster Configuration

Job Browser can display both MRv1 and MRv2 jobs, but must be configured to display one type at a time by specifying either [[mapred_clusters]] or [[yarn_clusters]] sections in the Hue configuration file.

The following MapReduce cluster properties are defined under the [[mapred_clusters]] sub-section:

[[[default]]]

The section containing the default settings.

jobtracker_host

The fully-qualified domain name of the host running the JobTracker.

jobtracker_port

The port for the JobTracker IPC service.

submit_to

If your Oozie is configured to use a 0.20 MapReduce service, then set this to true. Indicates that Hue should submit jobs to this MapReduce cluster.

hadoop_mapred_home

The home directory of the Hadoop MapReduce installation. For CDH packages, this is the root of the Hadoop MRv1 untarred directory, the root of the Hadoop 2.0 untarred directory, or /usr/lib/hadoop-mapreduce (for MRv1). If submit_to is true, this value is used as $HADOOP_MAPRED_HOME for the Beeswax Server and child shell processes.

hadoop_bin

The MRv1 Hadoop launcher script, usually /usr/bin/hadoop.

hadoop_conf_dir

The configuration directory of the MRv1 service, typically /etc/hadoop/conf.
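
For example, an MRv1 cluster definition might look like the following sketch. The hostname jobtracker.example.com and port 8021 are assumptions for illustration; hadoop_mapred_home is omitted here because its value depends on how MRv1 is installed.

```ini
[hadoop]
  [[mapred_clusters]]
    [[[default]]]
      # Placeholder host and port (assumption)
      jobtracker_host=jobtracker.example.com
      jobtracker_port=8021
      # Hue submits jobs to this MapReduce cluster
      submit_to=true
      hadoop_bin=/usr/bin/hadoop
      hadoop_conf_dir=/etc/hadoop/conf
```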

The following YARN cluster properties are defined under the [[yarn_clusters]] sub-section:

[[[default]]]

The section containing the default settings.

resourcemanager_host

The fully-qualified domain name of the host running the ResourceManager.

resourcemanager_port

The port for the ResourceManager IPC service.

submit_to

If your Oozie is configured to use a YARN cluster, then set this to true. Indicate that Hue should submit jobs to this YARN cluster.

hadoop_mapred_home

The home directory of the Hadoop MapReduce installation. For CDH packages, this is the root of the Hadoop 2.0 untarred directory or /usr/lib/hadoop-mapreduce (for YARN). If submit_to is true, this value is used as $HADOOP_MAPRED_HOME for the Beeswax Server and child shell processes.

hadoop_bin

The YARN Hadoop launcher script, usually /usr/bin/hadoop.

hadoop_conf_dir

The configuration directory of the YARN service, typically /etc/hadoop/conf.
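
A YARN cluster definition could be sketched as follows. The hostname resourcemanager.example.com and port 8032 are assumptions for illustration; the paths are the typical locations mentioned above.

```ini
[hadoop]
  [[yarn_clusters]]
    [[[default]]]
      # Placeholder host and port (assumption)
      resourcemanager_host=resourcemanager.example.com
      resourcemanager_port=8032
      # Hue submits jobs to this YARN cluster
      submit_to=true
      hadoop_mapred_home=/usr/lib/hadoop-mapreduce
      hadoop_bin=/usr/bin/hadoop
      hadoop_conf_dir=/etc/hadoop/conf
```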

Liboozie Configuration

In the [liboozie] section of the configuration file, you can optionally specify the following:

security_enabled

Indicate whether Oozie requires clients to perform Kerberos authentication.

remote_deployement_dir

The location in HDFS where the workflows and coordinators are deployed when submitted by a non-owner.

oozie_url

The URL of the Oozie server.
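
As a sketch, a [liboozie] section for an Oozie server without Kerberos might look like this. The URL assumes Oozie is running locally on its usual port, 11000, and the deployment path is an illustrative assumption. The property name remote_deployement_dir is spelled as shown in this section.

```ini
[liboozie]
# Placeholder URL (assumption); replace with your Oozie server
oozie_url=http://localhost:11000/oozie
# Set to true if Oozie requires Kerberos authentication
security_enabled=false
# Illustrative HDFS path (assumption)
remote_deployement_dir=/user/hue/oozie/deployments
```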