This is the documentation for Cloudera Impala 1.4.0.
Documentation for other versions is available at Cloudera.com.

Enabling Kerberos Authentication for Impala

Impala supports Kerberos authentication. For more information on enabling Kerberos authentication, see the topic on Configuring Hadoop Security in the CDH4 Security Guide or the CDH 5 Security Guide.

Impala currently does not support application data wire encryption.

When using Impala in a managed environment, Cloudera Manager automatically completes Kerberos configuration. In an unmanaged environment, create a Kerberos principal for each host running impalad or statestored. Cloudera recommends using a consistent format, such as impala/_HOST@Your-Realm, but you can use any three-part Kerberos server principal.

  Note: Regardless of the authentication mechanism used, Impala always creates HDFS directories and data files owned by the same user (typically impala). To implement user-level access to different databases, tables, columns, partitions, and so on, use the Sentry authorization feature, as explained in Enabling Sentry Authorization for Impala.

Continue reading:

Requirements for Using Impala with Kerberos

On version 5 of Red Hat Enterprise Linux and comparable distributions, some additional setup is needed for the impala-shell interpreter to connect to a Kerberos-enabled Impala cluster:
sudo yum install python-devel openssl-devel python-pip
sudo pip-python install ssl
  Important:

If you plan to use Impala in your cluster, you must configure your KDC to allow tickets to be renewed, and you must configure krb5.conf to request renewable tickets. Typically, you can do this by adding the max_renewable_life setting to your realm in kdc.conf, and by adding the renew_lifetime parameter to the libdefaults section of krb5.conf. For more information about renewable tickets, see the Kerberos documentation.

Currently, you cannot use the resource management feature in CDH 5 on a cluster that has Kerberos authentication enabled.

Start all impalad and statestored daemons with the --principal and --keytab-file flags set to the principal and full path name of the keytab file containing the credentials for the principal.

Impala supports the Cloudera ODBC driver and the Kerberos interface provided. To use Kerberos through the ODBC driver, the host type must be set depending on the level of the ODBD driver:

  • SecImpala for the ODBC 1.0 driver.
  • SecBeeswax for the ODBC 1.2 driver.
  • Blank for the ODBC 2.0 driver or higher, when connecting to a secure cluster.
  • HS2NoSasl for the ODBC 2.0 driver or higher, when connecting to a non-secure cluster.

To enable Kerberos in the Impala shell, start the impala-shell command using the -k flag.

To enable Impala to work with Kerberos security on your Hadoop cluster, make sure you perform the installation and configuration steps in the topic on Configuring Hadoop Security in the CDH4 Security Guide or the CDH 5 Security Guide. Also note that when Kerberos security is enabled in Impala, a web browser that supports Kerberos HTTP SPNEGO is required to access the Impala web console (for example, Firefox, Internet Explorer, or Chrome).

If the NameNode, Secondary NameNode, DataNode, JobTracker, TaskTrackers, ResourceManager, NodeManagers, HttpFS, Oozie, Impala, or Impala statestore services are configured to use Kerberos HTTP SPNEGO authentication, and two or more of these services are running on the same host, then all of the running services must use the same HTTP principal and keytab file used for their HTTP endpoints.

Configuring Impala to Support Kerberos Security

Enabling Kerberos authentication for Impala involves steps that can be summarized as follows:

  • Creating service principals for Impala and the HTTP service. Principal names take the form: serviceName/fully.qualified.domain.name@KERBEROS.REALM
  • Creating, merging, and distributing key tab files for these principals.
  • Editing /etc/default/impala (in cluster not managed by Cloudera Manager), or editing the Security settings in the Cloudera Manager interface, to accommodate Kerberos authentication.

Enabling Kerberos for Impala

  1. Create an Impala service principal, specifying the name of the OS user that the Impala daemons run under, the fully qualified domain name of each node running impalad, and the realm name. For example:
    $ kadmin
    kadmin: addprinc -requires_preauth -randkey impala/impala_host.example.com@TEST.EXAMPLE.COM
  2. Create an HTTP service principal. For example:
    kadmin: addprinc -randkey HTTP/impala_host.example.com@TEST.EXAMPLE.COM
      Note: The HTTP component of the service principal must be uppercase as shown in the preceding example.
  3. Create keytab files with both principals. For example:
    kadmin: xst -k impala.keytab impala/impala_host.example.com
    kadmin: xst -k http.keytab HTTP/impala_host.example.com
    kadmin: quit
  4. Use ktutil to read the contents of the two keytab files and then write those contents to a new file. For example:
    $ ktutil
    ktutil: rkt impala.keytab
    ktutil: rkt http.keytab
    ktutil: wkt impala-http.keytab
    ktutil: quit
  5. (Optional) Test that credentials in the merged keytab file are valid, and that the "renew until" date is in the future. For example:
    $ klist -e -k -t impala-http.keytab
  6. Copy the impala-http.keytab file to the Impala configuration directory. Change the permissions to be only read for the file owner and change the file owner to the impala user. By default, the Impala user and group are both named impala. For example:
    $ cp impala-http.keytab /etc/impala/conf
    $ cd /etc/impala/conf
    $ chmod 400 impala-http.keytab
    $ chown impala:impala impala-http.keytab
  7. Add Kerberos options to the Impala defaults file, /etc/default/impala. Add the options for both the impalad and statestored daemons, using the IMPALA_SERVER_ARGS and IMPALA_STATE_STORE_ARGS variables. For example, you might add:
    -kerberos_reinit_interval=60
    -principal=impala_1/impala_host.example.com@TEST.EXAMPLE.COM
    -keytab_file=/var/run/cloudera-scm-agent/process/3212-impala-IMPALAD/impala.keytab

    For more information on changing the Impala defaults specified in /etc/default/impala, see Modifying Impala Startup Options.

  Note: Restart impalad and statestored for these configuration changes to take effect.

Enabling Kerberos for Impala with a Proxy Server

A common configuration for Impala with High Availability is to use a proxy server to submit requests to the actual impalad daemons on different hosts in the cluster. This configuration avoids connection problems in case of machine failure, because the proxy server can route new requests through one of the remaining hosts in the cluster. This configuration also helps with load balancing, because the additional overhead of being the "coordinator node" for each query is spread across multiple hosts.

Although you can set up a proxy server with or without Kerberos authentication, typically users set up a secure Kerberized configuration. For information about setting up a proxy server for Impala, including Kerberos-specific steps, see Using Impala through a Proxy for High Availability.

Using a Web Browser to Access a URL Protected by Kerberos HTTP SPNEGO

Your web browser must support Kerberos HTTP SPNEGO. For example, Chrome, Firefox, or Internet Explorer.

To configure Firefox to access a URL protected by Kerberos HTTP SPNEGO:

  1. Open the advanced settings Firefox configuration page by loading the about:config page.
  2. Use the Filter text box to find network.negotiate-auth.trusted-uris.
  3. Double-click the network.negotiate-auth.trusted-uris preference and enter the hostname or the domain of the web server that is protected by Kerberos HTTP SPNEGO. Separate multiple domains and hostnames with a comma.
  4. Click OK.