This is the documentation for CDH 4.6.0.
Documentation for other versions is available at Cloudera Documentation.

Integrating Hadoop Security with Active Directory

One of the ramifications of enabling security on a Hadoop cluster is that every user who interacts with the cluster must have a Kerberos principal configured. For organizations that use Active Directory to manage user accounts, it can be onerous to create corresponding user accounts for each user in an MIT Kerberos realm. Fortunately, it is possible to integrate Active Directory with Hadoop's security features.

To configure Hadoop to use Active Directory:

  1. Run an MIT Kerberos KDC and realm local to the cluster and create all service principals in this realm.
  2. Set up one-way cross-realm trust from this realm to the Active Directory realm. Using this method, there is no need to create service principals in Active Directory, but Active Directory principals (users) can be authenticated to Hadoop. See Configuring a Local MIT Kerberos Realm to Trust Active Directory.

Cloudera strongly recommends the method above because:

  • It requires minimal configuration in Active Directory.
  • It is comparatively easy to script the creation of many principals and keytabs. A principal and keytab must be created for every daemon in the cluster, and in a large cluster this can be extremely onerous to do directly in Active Directory.
  • There is no need to involve central Active Directory administrators in order to get service principals created.
  • It allows for incremental configuration. The Hadoop administrator can completely configure and verify the functionality of the cluster independently of integrating with Active Directory.
  • It can shield the corporate Active Directory server(s) from the many machines in a Hadoop cluster all requesting Kerberos tickets simultaneously. During cluster start-up, Hadoop can effectively act as a distributed denial-of-service attack on the central Active Directory server and degrade its performance.
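
As an illustration of the scripting point above, the following minimal sketch prints the kadmin commands needed to create one principal and one keytab per daemon. The service names, host names, and keytab directory are assumptions chosen for the example and are not prescribed by this guide; adjust them to match your cluster.

```python
# Sketch: generate kadmin commands that create one principal and one keytab
# per (service, host) pair. The service names, hosts, and keytab directory
# used here are illustrative assumptions, not values from this guide.

def kadmin_commands(services, hosts, realm, keytab_dir="/etc/hadoop/conf"):
    """Return a list of kadmin commands for every service/host combination."""
    commands = []
    for host in hosts:
        for service in services:
            principal = f"{service}/{host}@{realm}"
            keytab = f"{keytab_dir}/{service}.{host}.keytab"
            # addprinc -randkey creates the principal with a random key.
            commands.append(f"addprinc -randkey {principal}")
            # xst -k writes keys for the principal into the named keytab file.
            commands.append(f"xst -k {keytab} {principal}")
    return commands


if __name__ == "__main__":
    for line in kadmin_commands(
        services=["hdfs", "mapred", "HTTP"],
        hosts=["node1.cluster.corp.company.com", "node2.cluster.corp.company.com"],
        realm="YOUR-LOCAL-REALM.COMPANY.COM",
    ):
        print(line)
```

The printed lines can be reviewed and then pasted into a kadmin.local session on the MIT KDC server.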

Configuring a Local MIT Kerberos Realm to Trust Active Directory

On the Active Directory Server

  1. Type the following command to specify the local MIT KDC host name (for example, kdc-server-hostname.cluster.corp.company.com) and local realm (for example, YOUR-LOCAL-REALM.COMPANY.COM):
    ksetup /addkdc YOUR-LOCAL-REALM.COMPANY.COM kdc-server-hostname.cluster.corp.company.com
    Run this command on every domain controller that will be referenced by the cluster's krb5.conf file. If load balancing is being used and a single KDC hostname has to be provided to all domain controllers, refer to the Microsoft documentation instead of explicitly using the ksetup command on individual domain controllers.
  2. Type the following command to add the local realm trust to Active Directory:
    netdom trust YOUR-LOCAL-REALM.COMPANY.COM /Domain:AD-REALM.COMPANY.COM /add /realm /passwordt:<TrustPassword>
  3. Type the following command to set the encryption type:

    On Windows 2003 R2:

    Windows 2003 server installations do not support AES encryption for Kerberos. Therefore RC4 should be used. Please see the Microsoft reference documentation for more information.
    ktpass /MITRealmName YOUR-LOCAL-REALM.COMPANY.COM /TrustEncryp RC4
    On Windows 2008:
      Note: When using AES 256 encryption with Windows 2008 you must update the proper Java Cryptography Extension (JCE) policy files for the version of JDK you are using.
    ksetup /SetEncTypeAttr YOUR-LOCAL-REALM.COMPANY.COM <enc_type>
    Replace the <enc_type> parameter with the parameter string for AES, DES, or RC4 encryption. For example, for AES encryption, replace <enc_type> with AES256-CTS-HMAC-SHA1-96 or AES128-CTS-HMAC-SHA1-96, and for RC4 encryption, replace it with RC4-HMAC-MD5. See the Microsoft reference documentation for more information.
      Important:

    Make sure the encryption type you specify is supported on both your version of Windows Active Directory and your version of MIT Kerberos.

On the MIT KDC server

Type the following command in the kadmin.local or kadmin shell to add the cross-realm krbtgt principal. Use the same password you used in the netdom command on the Active Directory Server.
kadmin:  addprinc -e "<enc_type_list>" krbtgt/YOUR-LOCAL-REALM.COMPANY.COM@AD-REALM.COMPANY.COM
where the <enc_type_list> parameter specifies the types of encryption this cross-realm krbtgt principal will support: AES, DES, or RC4 encryption. You can specify multiple encryption types in this parameter. What matters is that at least one of them corresponds to the encryption type found in the tickets granted by the KDC in the remote realm.
Examples by Active Directory server type
  • For Windows 2003:
    kadmin: addprinc -e "rc4-hmac:normal" krbtgt/YOUR-LOCAL-REALM.COMPANY.COM@AD-REALM.COMPANY.COM
  • For Windows 2008:
    kadmin: addprinc -e "aes256-cts:normal aes128-cts:normal rc4-hmac:normal" krbtgt/YOUR-LOCAL-REALM.COMPANY.COM@AD-REALM.COMPANY.COM
  Note:

The cross-realm krbtgt principal that you add in this step must have at least one entry that uses the same encryption type as the tickets that are issued by the remote KDC. If no entry matches, authenticating as a principal in the local realm will still allow you to run Hadoop commands, but authenticating as a principal in the remote realm will not.
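
The matching requirement in the note above can be checked before you create the principal. The following is a minimal sketch, assuming you know which encryption type was set for the trust on the Active Directory side. The alias table is a small hand-written assumption that covers only the names used in this section; it is not a complete list of Kerberos encryption type names.

```python
# Sketch: check that the <enc_type_list> given to kadmin shares at least one
# encryption type with the Active Directory trust. The alias table is a
# hand-written assumption covering only the names used in this section.

CANONICAL = {
    "aes256-cts": "aes256-cts-hmac-sha1-96",
    "aes256-cts-hmac-sha1-96": "aes256-cts-hmac-sha1-96",
    "aes128-cts": "aes128-cts-hmac-sha1-96",
    "aes128-cts-hmac-sha1-96": "aes128-cts-hmac-sha1-96",
    "rc4": "rc4-hmac",
    "rc4-hmac": "rc4-hmac",
    "rc4-hmac-md5": "rc4-hmac",
}


def canonical(name):
    """Drop a kadmin salt suffix such as ':normal' and normalise the name."""
    base = name.split(":", 1)[0].strip().lower()
    return CANONICAL.get(base, base)


def shared_enctypes(kadmin_enc_list, ad_enc_types):
    """Return the sorted encryption types common to both sides."""
    local = {canonical(n) for n in kadmin_enc_list.split()}
    remote = {canonical(n) for n in ad_enc_types}
    return sorted(local & remote)


if __name__ == "__main__":
    common = shared_enctypes(
        "aes256-cts:normal aes128-cts:normal rc4-hmac:normal",
        ["AES256-CTS-HMAC-SHA1-96"],
    )
    print(common if common else "No shared encryption type")
```

An empty result indicates the situation described in the note: local principals would work, but Active Directory principals would not.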

On all of the cluster machines

  1. Verify that both Kerberos realms are configured on all of the cluster machines. Note that the default realm and the domain realm should remain set as the MIT Kerberos realm that is local to the cluster.
    [realms]
      AD-REALM.CORP.FOO.COM = {
        kdc = ad.corp.foo.com:88
        admin_server = ad.corp.foo.com:749
        default_domain = foo.com
      }
      CLUSTER-REALM.CORP.FOO.COM = {
        kdc = cluster01.corp.foo.com:88
        admin_server = cluster01.corp.foo.com:749
        default_domain = foo.com
      }
  2. To properly translate principal names from the Active Directory realm into local names within Hadoop, you must configure the hadoop.security.auth_to_local setting in the core-site.xml file on all of the cluster machines. The following example translates all principal names with the realm AD-REALM.CORP.FOO.COM into the first component of the principal name only. It also preserves the standard translation for the default realm (the cluster realm).
    <property>
      <name>hadoop.security.auth_to_local</name>
      <value>
        RULE:[1:$1@$0](^.*@AD-REALM\.CORP\.FOO\.COM$)s/^(.*)@AD-REALM\.CORP\.FOO\.COM$/$1/g
        RULE:[2:$1@$0](^.*@AD-REALM\.CORP\.FOO\.COM$)s/^(.*)@AD-REALM\.CORP\.FOO\.COM$/$1/g
        DEFAULT
      </value>
    </property>
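
To see what the rules in step 2 do to a given principal, the following sketch emulates them in Python. It is an approximation written for illustration, not Hadoop's own implementation. It assumes that the default realm is CLUSTER-REALM.CORP.FOO.COM, as in the krb5.conf example in step 1, and that every principal passed in includes a realm.

```python
import re

# Sketch: emulate the two RULE lines and DEFAULT from the example above.
# Assumption: the default realm is the cluster realm from the krb5.conf example.
DEFAULT_REALM = "CLUSTER-REALM.CORP.FOO.COM"
AD_PATTERN = re.compile(r"^(.*)@AD-REALM\.CORP\.FOO\.COM$")


def short_name(principal):
    """Map 'user@REALM' or 'service/host@REALM' to a local short name."""
    name, _, realm = principal.rpartition("@")
    components = name.split("/")
    # [1:$1@$0] applies to one-component names and [2:$1@$0] to two-component
    # names; both format the principal as "<first component>@<realm>".
    if len(components) in (1, 2):
        formatted = f"{components[0]}@{realm}"
        # The rule applies only if the formatted string matches the filter;
        # the sed-style substitution then strips the realm.
        if AD_PATTERN.match(formatted):
            return AD_PATTERN.sub(r"\1", formatted)
    # DEFAULT: principals in the default realm map to their first component.
    if realm == DEFAULT_REALM:
        return components[0]
    raise ValueError(f"No rule applies to {principal}")


if __name__ == "__main__":
    for p in (
        "alice@AD-REALM.CORP.FOO.COM",
        "hdfs/cluster01.corp.foo.com@CLUSTER-REALM.CORP.FOO.COM",
    ):
        print(p, "->", short_name(p))
```

A principal from any other realm matches neither rule nor DEFAULT, so the sketch raises an error for it.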

For more information about name mapping rules, see Configuring the Mapping from Kerberos Principals to Short Names.