This is the documentation for CDH 4.7.0.

HiveServer2 Security Configuration

Introduction to HiveServer2 Security in CDH4

HiveServer2 supports authentication of the Thrift client using either of these methods:

  • Kerberos authentication
  • LDAP authentication

See the following sections for more information on security configurations for HiveServer2.

If Kerberos authentication is used, authentication is supported between the Thrift client and HiveServer2, and between HiveServer2 and secure HDFS. If LDAP authentication is used, authentication is supported only between the Thrift client and HiveServer2.

To configure HiveServer2 to use one of these authentication modes, set the hive.server2.authentication configuration property as described in the following sections.

Enabling HiveServer2 on a Kerberos-Secured Cluster

If you configure HiveServer2 to use Kerberos authentication, HiveServer2 acquires a Kerberos ticket during start-up. HiveServer2 requires a principal and keytab file specified in the configuration. Client applications (for example, JDBC clients or Beeline) must obtain a valid Kerberos ticket before initiating a connection to HiveServer2.

Enabling Kerberos Authentication for HiveServer2

To enable Kerberos Authentication for HiveServer2, add the following properties in the /etc/hive/conf/hive-site.xml file:

<property>
  <name>hive.server2.authentication</name>
  <value>KERBEROS</value>
</property>
<property>
  <name>hive.server2.authentication.kerberos.principal</name>
  <value>hive/_HOST@YOUR-REALM.COM</value>
</property>
<property>
  <name>hive.server2.authentication.kerberos.keytab</name>
  <value>/etc/hive/conf/hive.keytab</value>
</property>

where:

  • The hive/_HOST@YOUR-REALM.COM value in the example above is the Kerberos principal for the host where HiveServer2 is running. The special string _HOST is replaced at run time by the fully qualified domain name of the host machine where the daemon is running, which requires reverse DNS to work properly on all hosts configured this way. Replace YOUR-REALM.COM with the name of the Kerberos realm your Hadoop cluster is in.
  • The /etc/hive/conf/hive.keytab value in the example above is a keytab file for that principal.
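
The _HOST substitution described above can be pictured with a short Java sketch. This is an illustration only, not the code HiveServer2 runs; the class and method names (HostPrincipal, resolve, resolveForLocalHost) are hypothetical.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Hive or Hadoop: illustrates _HOST expansion.
class HostPrincipal {
    static final String HOST_TOKEN = "_HOST";

    // Replaces the _HOST token in a principal pattern with the given host name.
    static String resolve(String principalPattern, String fqdn) {
        return principalPattern.replace(HOST_TOKEN, fqdn);
    }

    // Resolves against the local machine's canonical host name, which depends
    // on DNS (including reverse lookup) being set up correctly.
    static String resolveForLocalHost(String principalPattern) throws UnknownHostException {
        String fqdn = InetAddress.getLocalHost().getCanonicalHostName();
        return resolve(principalPattern, fqdn);
    }

    public static void main(String[] args) throws UnknownHostException {
        System.out.println(resolveForLocalHost("hive/_HOST@YOUR-REALM.COM"));
    }
}
```

Running the main method on the HiveServer2 host is a quick way to see which host name a misconfigured DNS setup would produce.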

If you configure HiveServer2 to use both Kerberos authentication and secure impersonation, JDBC clients and Beeline can specify an alternate session user. If these clients have proxy user privileges, HiveServer2 impersonates the alternate user instead of the connecting user. To specify the alternate user, add proxyUser=userName to the JDBC connection string.

Configuring JDBC Clients for Kerberos Authentication with HiveServer2

JDBC-based clients must include principal=<hive.server2.authentication.kerberos.principal> in the JDBC connection string. For example:

String url = "jdbc:hive2://node1:10000/default;principal=hive/HiveServer2Host@YOUR-REALM.COM";
Connection con = DriverManager.getConnection(url);

where hive is the first component of the principal configured in hive-site.xml and HiveServer2Host is the fully qualified domain name of the host where HiveServer2 is running.
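
A fuller version of the fragment above might look like the following sketch. It assumes the Hive JDBC driver (org.apache.hive.jdbc.HiveDriver) is on the classpath and that the caller already holds a valid Kerberos ticket; the class name KerberosJdbcExample, the buildUrl helper, and the SHOW TABLES query are illustrative choices, not part of the product.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical example class; host, port, and principal are placeholders.
class KerberosJdbcExample {
    // Builds a HiveServer2 JDBC URL that carries the server's Kerberos principal.
    static String buildUrl(String host, int port, String database, String serverPrincipal) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database
                + ";principal=" + serverPrincipal;
    }

    public static void main(String[] args) throws ClassNotFoundException, SQLException {
        // Load the HiveServer2 JDBC driver (must be on the classpath).
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        String url = buildUrl("node1", 10000, "default", "hive/HiveServer2Host@YOUR-REALM.COM");
        // Requires a valid Kerberos ticket in the caller's ticket cache (for example, from kinit).
        try (Connection con = DriverManager.getConnection(url);
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW TABLES")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}
```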

Using Beeline to Connect to a Secure HiveServer2

Use the following commands to start Beeline and connect to a running, secure HiveServer2 process. In this example, the HiveServer2 process is running on localhost at port 10000:

$ /usr/lib/hive/bin/beeline
beeline> !connect jdbc:hive2://localhost:10000/default;principal=hive/HiveServer2Host@YOUR-REALM.COM
0: jdbc:hive2://localhost:10000/default>

For more information about the Beeline CLI, see Using the Beeline CLI.

Encrypted Communication with Client Drivers

With Kerberos or LDAP authentication enabled, traffic between the Hive JDBC or ODBC drivers and HiveServer2 can be encrypted. Encryption preserves data integrity (using checksums to validate message integrity) and confidentiality (by encrypting messages). To enable it, set the hive.server2.thrift.sasl.qop property in hive-site.xml. For example:

<property>
  <name>hive.server2.thrift.sasl.qop</name>
  <value>auth</value>
  <description>Sasl QOP value; one of 'auth', 'auth-int' and 'auth-conf'</description>
</property>

Valid settings for the value field are:
  • auth: Authentication only (default)
  • auth-int: Authentication with integrity protection
  • auth-conf: Authentication with confidentiality protection

Configuring Encrypted Client/Server Communication for non-Kerberos HiveServer2 Connections

For non-Kerberos connections, you can configure Secure Sockets Layer (SSL) communication between HiveServer2 and clients.

  • To enable server-side support, add the following configuration parameters to hive-site.xml:
    <property>
      <name>hive.server2.enable.SSL</name>
      <value>true</value>
      <description>enable/disable SSL </description>
    </property>
     
    <property>
      <name>hive.server2.keystore.path</name>
      <value>keystore-file-path</value>
      <description>path to keystore file</description>
    </property>
    
    <property>
      <name>hive.server2.keystore.password</name>
      <value>keystore-file-password</value>
      <description>keystore password</description>
    </property>
  • The keystore must contain the server's certificate.

  • The JDBC client must add the following properties to the connection URL when connecting to a HiveServer2 instance that uses SSL:
    ;ssl=true[;sslTrustStore=<Trust-Store-Path>;trustStorePassword=<Trust-Store-password>]
  • Make sure one of the following is true:
    • Either: sslTrustStore points to the trust store file containing the server's certificate; for example:
      jdbc:hive2://localhost:10000/default;ssl=true;\
      sslTrustStore=/home/usr1/ssl/trust_store.jks;trustStorePassword=xyz
      
    • or: the Trust Store arguments are set using the Java system properties javax.net.ssl.trustStore and javax.net.ssl.trustStorePassword; for example:
      java -Djavax.net.ssl.trustStore=/home/usr1/ssl/trust_store.jks -Djavax.net.ssl.trustStorePassword=xyz \
       MyClass "jdbc:hive2://localhost:10000/default;ssl=true"

For more information on using self-signed certificates and the Trust Store, see the Oracle Java SE keytool page.
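
The two client-side options above can be sketched in Java as follows. The class SslJdbcUrl and its method names are hypothetical; the URL parameters and system property names are the ones listed in this section, and the paths and password are the placeholder values from the examples.

```java
// Hypothetical helper: shows the two ways of supplying the trust store.
class SslJdbcUrl {
    // Option 1: carry the trust store location and password in the URL itself.
    static String withTrustStoreInUrl(String baseUrl, String trustStorePath, String trustStorePassword) {
        return baseUrl + ";ssl=true;sslTrustStore=" + trustStorePath
                + ";trustStorePassword=" + trustStorePassword;
    }

    // Option 2: set the standard Java system properties and keep the URL short.
    static String withSystemProperties(String baseUrl, String trustStorePath, String trustStorePassword) {
        System.setProperty("javax.net.ssl.trustStore", trustStorePath);
        System.setProperty("javax.net.ssl.trustStorePassword", trustStorePassword);
        return baseUrl + ";ssl=true";
    }

    public static void main(String[] args) {
        String base = "jdbc:hive2://localhost:10000/default";
        System.out.println(withTrustStoreInUrl(base, "/home/usr1/ssl/trust_store.jks", "xyz"));
        System.out.println(withSystemProperties(base, "/home/usr1/ssl/trust_store.jks", "xyz"));
    }
}
```

Setting the system properties in code has the same effect as passing the -D options on the java command line, but it applies to every SSL connection the JVM opens afterwards.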

Using LDAP Username/Password Authentication with HiveServer2

As an alternative to Kerberos authentication, you can configure HiveServer2 to use user and password validation backed by LDAP. In this case, the client sends a user name and password during the connection initiation. HiveServer2 validates these credentials using an external LDAP service.

You can enable LDAP Authentication with HiveServer2 using Active Directory or OpenLDAP.

Enabling LDAP Authentication with HiveServer2 using Active Directory

To enable the LDAP mode of authentication using Active Directory, include the following properties in the hive-site.xml file:

<property>
  <name>hive.server2.authentication</name>
  <value>LDAP</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.url</name>
  <value>LDAP_URL</value>
</property>

where:

  • The LDAP_URL value is the access URL for your LDAP server. For example, ldap://ldaphost.company.com.

Enabling LDAP Authentication with HiveServer2 using OpenLDAP

To enable the LDAP mode of authentication using OpenLDAP, include the following properties in the hive-site.xml file:

<property>
  <name>hive.server2.authentication</name>
  <value>LDAP</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.url</name>
  <value>LDAP_URL</value>
</property>
<property>
  <name>hive.server2.authentication.ldap.baseDN</name>
  <value>LDAP_BaseDN</value>
</property>

where:

  • The LDAP_URL value is the access URL for your LDAP server.
  • The LDAP_BaseDN value is the base LDAP DN for your LDAP server. For example, ou=People,dc=example,dc=com.

Configuring JDBC Clients for LDAP Authentication with HiveServer2

JDBC-based clients must include user=LDAP_Userid;password=LDAP_Password in the JDBC connection string. For example:

String url = "jdbc:hive2://node1:10000/default;user=LDAP_Userid;password=LDAP_Password";
Connection con = DriverManager.getConnection(url);

where LDAP_Userid is the user ID and LDAP_Password is the password of the client user.
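
To keep the password out of the URL string, the credentials can instead be passed as standard JDBC connection properties. The sketch below assumes the Hive JDBC driver honors the standard user and password properties; the class name LdapJdbcExample and the environment variable names HIVE_LDAP_USER and HIVE_LDAP_PASSWORD are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

// Hypothetical example class; host and port are placeholders.
class LdapJdbcExample {
    // Packs the LDAP credentials into standard JDBC connection properties.
    static Properties credentials(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        return props;
    }

    public static void main(String[] args) throws ClassNotFoundException, SQLException {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // HIVE_LDAP_USER and HIVE_LDAP_PASSWORD are assumed names, not product settings.
        String user = System.getenv("HIVE_LDAP_USER");
        String password = System.getenv("HIVE_LDAP_PASSWORD");
        if (user == null || password == null) {
            throw new IllegalStateException("Set HIVE_LDAP_USER and HIVE_LDAP_PASSWORD first");
        }
        String url = "jdbc:hive2://node1:10000/default";
        try (Connection con = DriverManager.getConnection(url, credentials(user, password))) {
            System.out.println("Connected: " + !con.isClosed());
        }
    }
}
```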

Configuring LDAPS Authentication with HiveServer2

HiveServer2 supports LDAP username/password authentication for clients. Clients send LDAP credentials to HiveServer2, which in turn verifies them with the configured LDAP provider, such as OpenLDAP or Microsoft's Active Directory. Most vendors now support LDAPS (LDAP over SSL), an authentication protocol that uses SSL to encrypt communication between the LDAP service and its client (in this case, HiveServer2) to avoid sending LDAP credentials in cleartext.

Perform the following steps to configure the LDAPS service with HiveServer2:

  • Import either the SSL certificate of the Certificate Authority (CA) that issued the LDAP server's certificate, or the LDAP server's own SSL certificate, into a local truststore. If you import the CA certificate, HiveServer2 will trust any server with a certificate issued by that CA. If you import only the server's SSL certificate, HiveServer2 will trust only that server. In both cases, the certificate must be imported on the same host as HiveServer2. See the keytool documentation for more details.
  • Make sure the truststore file is readable by the hive user.
  • Set the hive.server2.authentication.ldap.url configuration property in hive-site.xml to the LDAPS URL. For example, ldaps://sample.myhost.com.
      Note: The URL scheme should be ldaps and not ldap.
  • Set the environment variable HADOOP_OPTS as follows:
    HADOOP_OPTS="-Djavax.net.ssl.trustStore=<trustStore-file-path> -Djavax.net.ssl.trustStorePassword=<trustStore-password>"
    For clusters managed by Cloudera Manager, go to the Hive service and select Configuration > View and Edit. Under the HiveServer2 category, go to the Advanced section and set the HiveServer2 Environment Safety Valve property.
  • Restart HiveServer2.
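
Before restarting, it can help to confirm that the truststore is readable and actually contains the imported certificate. The following Java sketch does that with the standard java.security.KeyStore API; it assumes a JKS-format truststore, and the class name TrustStoreCheck is hypothetical. Run it as the hive user to also verify file permissions.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper, not part of Hive.
class TrustStoreCheck {
    // Loads a JKS truststore and returns the aliases of its trusted-certificate entries.
    static List<String> certificateAliases(Path trustStore, char[] password)
            throws IOException, GeneralSecurityException {
        KeyStore ks = KeyStore.getInstance("JKS");
        try (InputStream in = Files.newInputStream(trustStore)) {
            ks.load(in, password);
        }
        List<String> aliases = new ArrayList<>();
        for (String alias : Collections.list(ks.aliases())) {
            if (ks.isCertificateEntry(alias)) {
                aliases.add(alias);
            }
        }
        return aliases;
    }

    // Usage: java TrustStoreCheck <trustStore-file-path> <trustStore-password>
    public static void main(String[] args) throws Exception {
        Path path = Paths.get(args[0]);
        if (!Files.isReadable(path)) {
            throw new IOException("Truststore is not readable by the current user: " + path);
        }
        System.out.println("Trusted certificates: " + certificateAliases(path, args[1].toCharArray()));
    }
}
```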

Pluggable Authentication

Pluggable authentication allows you to provide a custom authentication provider for HiveServer2.

To enable pluggable authentication:

  1. Set the following properties in /etc/hive/conf/hive-site.xml:
    <property>
      <name>hive.server2.authentication</name>
      <value>CUSTOM</value>
      <description>Client authentication types.
      NONE: no authentication check
      LDAP: LDAP/AD based authentication
      KERBEROS: Kerberos/GSSAPI authentication
      CUSTOM: Custom authentication provider
      (Use with property hive.server2.custom.authentication.class)
      </description>
    </property>
    
    <property>
      <name>hive.server2.custom.authentication.class</name>
      <value>pluggable-auth-class-name</value>
      <description>
      Custom authentication class. Used when property
      'hive.server2.authentication' is set to 'CUSTOM'. Provided class
      must be a proper implementation of the interface
      org.apache.hive.service.auth.PasswdAuthenticationProvider. HiveServer2
      will call its Authenticate(user, password) method to authenticate requests.
      The implementation may optionally extend Hadoop's
      org.apache.hadoop.conf.Configured class to grab Hive's Configuration object.
      </description>
    </property>
  2. Make the class available in the CLASSPATH of HiveServer2.
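
A custom provider could look roughly like the sketch below. So that it compiles without Hive on the classpath, the interface is redeclared locally as a stand-in; a real provider would implement org.apache.hive.service.auth.PasswdAuthenticationProvider instead, and would likely need to be a public class with a public no-argument constructor. The class name SampleAuthenticator and the in-memory user map are purely illustrative.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.sasl.AuthenticationException;

// Local stand-in for org.apache.hive.service.auth.PasswdAuthenticationProvider,
// declared here only so the sketch is self-contained.
interface PasswdAuthenticationProvider {
    void Authenticate(String user, String password) throws AuthenticationException;
}

// Hypothetical provider that checks credentials against an in-memory map.
class SampleAuthenticator implements PasswdAuthenticationProvider {
    private final Map<String, String> users = new HashMap<>();

    SampleAuthenticator() {
        // Illustrative entry only; never hard-code real credentials.
        users.put("user1", "passwd1");
    }

    // Returns normally when the credentials match; throws otherwise.
    @Override
    public void Authenticate(String user, String password) throws AuthenticationException {
        String expected = users.get(user);
        if (expected == null || !expected.equals(password)) {
            throw new AuthenticationException("Authentication failed for user " + user);
        }
    }
}
```

With a class like this packaged in a JAR on the HiveServer2 classpath, its fully qualified name would be the value of hive.server2.custom.authentication.class.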

Trusted Delegation with HiveServer2

HiveServer2 determines the identity of the connecting user from the underlying authentication subsystem (Kerberos or LDAP). Any new session started for this connection runs on behalf of this connecting user. If the server is configured to impersonate the user at the Hadoop level, then all MapReduce jobs and HDFS accesses will be performed with the identity of the connecting user. If Apache Sentry is configured, then this connecting userid can also be used to verify access rights to underlying tables, views and so on.

Starting with CDH 4.5, a connecting user (for example, hue) with Hadoop-level superuser privileges can request an alternate user for the given session. HiveServer2 checks whether the connecting user has Hadoop-level privileges to impersonate the requested user ID (for example, bob). If so, the new session runs on behalf of the alternate user, bob, requested by the connecting user, hue.

To specify an alternate user for new connections, the JDBC client must add the hive.server2.proxy.user=<alternate_user_id> property to the JDBC connection URL. The connecting user must have Hadoop-level proxy privileges over the alternate user. For example, if user hue requests access to run a session as user bob, log in and connect as follows:

# Log in as superuser hue
kinit -k -t hue.keytab hue@MY-REALM.COM

# Connect using the following JDBC connection string
# jdbc:hive2://myHost.myOrg.com:10000/default;principal=hive/_HOST@MY-REALM.COM;hive.server2.proxy.user=bob
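
When the alternate user comes from application input, a client may want to build the connection string programmatically. The Java sketch below appends the hive.server2.proxy.user property and rejects user IDs that contain URL delimiter characters; the class name ProxyUserUrl and the validation rule are assumptions made for illustration, not product behavior.

```java
// Hypothetical helper for building a proxy-user connection string.
class ProxyUserUrl {
    // Appends the hive.server2.proxy.user property to an existing JDBC URL.
    static String forUser(String baseUrl, String alternateUser) {
        // Reject characters that would start or alter another URL parameter.
        if (alternateUser.isEmpty() || alternateUser.matches(".*[;=?#/].*")) {
            throw new IllegalArgumentException("Invalid alternate user: " + alternateUser);
        }
        return baseUrl + ";hive.server2.proxy.user=" + alternateUser;
    }

    public static void main(String[] args) {
        String base = "jdbc:hive2://myHost.myOrg.com:10000/default;principal=hive/_HOST@MY-REALM.COM";
        System.out.println(forUser(base, "bob"));
    }
}
```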

HiveServer2 Impersonation

Impersonation support in HiveServer2 allows users to execute queries and access HDFS files as the connected user rather than the super user who started the HiveServer2 daemon. Impersonation allows admins to enforce an access policy at the file level using HDFS file and directory permissions.

To enable impersonation in HiveServer2:

  1. Add the following property to the /etc/hive/conf/hive-site.xml file and set the value to true. (The default value is false.)
    <property>
      <name>hive.server2.enable.impersonation</name>
      <description>Enable user impersonation for HiveServer2</description>
      <value>true</value>
    </property>
  2. In HDFS or MapReduce configurations, add the following properties to the core-site.xml file:
    <property>
      <name>hadoop.proxyuser.hive.hosts</name>
      <value>*</value>
    </property>
    <property>
      <name>hadoop.proxyuser.hive.groups</name>
      <value>*</value>
    </property>

See also File System Permissions.

Securing the Hive Metastore

To prevent users from accessing the Hive metastore and the Hive metastore database using any method other than through HiveServer2, the following actions are recommended:

  • Add a firewall rule on the metastore service host to allow access to the metastore port only from the HiveServer2 host. You can do this using iptables.
  • Grant access to the metastore database only from the metastore service host. This is specified for MySQL as:
    GRANT SELECT,INSERT,UPDATE,DELETE,LOCK TABLES,EXECUTE ON metastore.* TO 'hive'@'MetaStoreHost';
    where MetaStoreHost is the host where the metastore service is running.
  • Make sure users who are not admins cannot log on to the host on which HiveServer2 runs.