This is the documentation for Cloudera Search CDH 5 Beta 2 and 1.2.0 for CDH 4.
Documentation for other versions is available at Cloudera Documentation.

Configuring Sentry for Search

Sentry enables role-based, fine-grained authorization for Cloudera Search. Follow the instructions below to configure Sentry under CDH 4.5 or later or CDH 5. Sentry is included in the Search installation.
  Note: Sentry for Search depends on Kerberos authentication. For additional information on using Kerberos with Search, see Configuring Search to Use Kerberos and Using Kerberos.

Note that this document is for configuring Sentry for Cloudera Search. To download or install other versions of Sentry for other services, see:

Roles and Privileges

Sentry uses a role-based privilege model. A role is a set of rules for accessing a given Solr collection. Access to each collection is governed by privileges: Query, Update, or All (*).

For example, a rule for the Query privilege on collection logs would be formulated as follows:
collection=logs->action=Query
A role can contain multiple such rules, separated by commas. For example, engineer_role might contain the Query privilege for the hive_logs and hbase_logs collections, and the Update privilege for the current_bugs collection. You would specify this as follows:
engineer_role = collection=hive_logs->action=Query, \
  collection=hbase_logs->action=Query, \
  collection=current_bugs->action=Update
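
The rule syntax above can be illustrated with a minimal Python sketch that splits a role definition into (collection, action) pairs. This is not Sentry's own parser; the function name parse_role_line is hypothetical. Treating a rule without an action as * is an assumption taken from the sample policy file's comment.

```python
def parse_role_line(line):
    """Split 'role = rule, rule, ...' into (role_name, [(collection, action), ...])."""
    # The role name is everything before the first '='.
    name, _, body = line.partition("=")
    rules = []
    for rule in body.split(","):
        rule = rule.strip()
        if not rule:
            continue
        # Each rule has the form collection=<name>->action=<privilege>.
        resource, _, action_part = rule.partition("->")
        collection = resource.split("=", 1)[1].strip()
        # Assumption: a rule without an action grants all privileges (*).
        action = action_part.split("=", 1)[1].strip() if action_part else "*"
        rules.append((collection, action))
    return name.strip(), rules


line = ("engineer_role = collection=hive_logs->action=Query, "
        "collection=hbase_logs->action=Query, "
        "collection=current_bugs->action=Update")
role, rules = parse_role_line(line)
print(role, rules)
```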

Users and Groups

  • A user is an entity that is permitted by the Kerberos authentication system to access the Search service.
  • A group connects the authentication system with the authorization system. It is a set of one or more users who have been granted one or more authorization roles. Sentry allows a set of roles to be configured for a group.
  • A configured group provider determines a user’s affiliation with a group. The current release supports HDFS-backed groups and locally configured groups. For example,
    dev_ops = dev_role, ops_role

Here the group dev_ops is granted the roles dev_role and ops_role. The members of this group can complete searches that are allowed by these roles.

User to Group Mapping

You can configure Sentry to use either Hadoop groups or groups defined in the policy file.

  Important: You can use either Hadoop groups or local groups, but not both at the same time. Use local groups if you want to do a quick proof-of-concept. For production, use Hadoop groups.

To configure Hadoop groups:

Set the sentry.provider property in sentry-site.xml to org.apache.sentry.provider.file.HadoopGroupResourceAuthorizationProvider.
  Note: By default, this uses local shell groups. See the Group Mapping section of the HDFS Permissions Guide for more information.

OR

To configure local groups:

  1. Define local groups in a [users] section of the policy file. For example:
    [users]
    user1 = group1, group2, group3
    user2 = group2, group3
  2. In sentry-site.xml, set sentry.provider as follows:
    <property>
      <name>sentry.provider</name>
      <value>org.apache.sentry.provider.file.LocalGroupResourceAuthorizationProvider</value>
    </property>

Setup and Configuration

This release of Sentry stores the configuration as well as privilege policies in files. The sentry-site.xml file contains configuration options such as privilege policy file location. The Policy file contains the privileges and groups. It has a .ini file format and should be stored on HDFS.

Sentry is automatically installed when you install Cloudera Search for CDH or Cloudera Search 1.1.0 or later.

Policy file

The sections that follow contain notes on creating and maintaining the policy file.

  Warning: An invalid configuration disables all authorization while logging an exception.

Storing the Policy File

Considerations for storing the policy file(s) include:

  1. Replication count - Because the file is read for each query, you should increase this; 10 is a reasonable value.
  2. Updating the file - Updates to the file are only reflected when the Solr process is restarted.

Defining Roles

Keep in mind that role definitions are not cumulative; the newer definition replaces the older one. For example, the following results in role1 having privilege2, not privilege1 and privilege2.
role1 = privilege1
role1 = privilege2
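
The replacement behavior can be mimicked with a short Python sketch. It models only the documented outcome (the last definition wins) and is not Sentry's parser; read_roles is a hypothetical helper.

```python
def read_roles(lines):
    """Map each role name to its privileges; a repeated name overwrites the earlier entry."""
    roles = {}
    for line in lines:
        name, _, privileges = line.partition("=")
        # Plain assignment: a later definition replaces the earlier one.
        roles[name.strip()] = [p.strip() for p in privileges.split(",")]
    return roles


replaced = read_roles(["role1 = privilege1", "role1 = privilege2"])
combined = read_roles(["role1 = privilege1, privilege2"])
print(replaced)
print(combined)
```

To give a role both privileges, list them in a single definition, as in the second call.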

Sample Configuration

This section provides a sample configuration.

  Note: Sentry with CDH Search does not support multiple policy files. Other implementations of Sentry, such as Sentry for Hive, support different policy files for different databases, but Sentry for CDH Search does not.

Policy File

The following is an example of a CDH Search policy file. The sentry-provider.ini would exist in an HDFS location such as hdfs://ha-nn-uri/user/solr/sentry/sentry-provider.ini.

sentry-provider.ini
[groups]
# Assigns each Hadoop group to its set of roles
engineer = engineer_role
ops = ops_role
dev_ops = engineer_role, ops_role

[roles]
# The following grants all access to source_code.
# "collection = source_code" can also be used as syntactic
# sugar for "collection = source_code->action=*"
engineer_role = collection = source_code->action=*
# The following imply more restricted access.
ops_role = collection = hive_logs->action=Query
dev_ops_role = collection = hbase_logs->action=Query
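
To see how the groups and roles in this sample fit together, the following Python sketch loads the same policy text with the standard configparser module and answers "may this group perform this action on this collection?". It is an illustration only, not Sentry's evaluation logic. The helpers load_policy and is_allowed are hypothetical, and comparing action names case-insensitively is an assumption.

```python
import configparser

POLICY = """
[groups]
engineer = engineer_role
ops = ops_role
dev_ops = engineer_role, ops_role

[roles]
engineer_role = collection = source_code->action=*
ops_role = collection = hive_logs->action=Query
dev_ops_role = collection = hbase_logs->action=Query
"""


def load_policy(text):
    """Return (group -> [role], role -> [(collection, action)]) from policy text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    groups = {group: [r.strip() for r in role_list.split(",")]
              for group, role_list in parser["groups"].items()}
    roles = {}
    for role, body in parser["roles"].items():
        rules = []
        for rule in body.split(","):
            resource, _, action_part = rule.strip().partition("->")
            collection = resource.split("=", 1)[1].strip()
            # Assumption: a missing action means all privileges (*).
            action = action_part.split("=", 1)[1].strip() if action_part else "*"
            rules.append((collection, action.lower()))
        roles[role] = rules
    return groups, roles


def is_allowed(groups, roles, group, collection, action):
    """True if any role granted to the group covers the action on the collection."""
    for role in groups.get(group, []):
        for rule_collection, rule_action in roles.get(role, []):
            if rule_collection == collection and rule_action in ("*", action.lower()):
                return True
    return False


groups, roles = load_policy(POLICY)
print(is_allowed(groups, roles, "dev_ops", "source_code", "Update"))
print(is_allowed(groups, roles, "ops", "hive_logs", "Update"))
```

Note that dev_ops_role is defined in the sample but not assigned to any group, so in this sketch no group gains Query access to hbase_logs.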

Sentry Configuration File

The following is an example of a sentry-site.xml file.

sentry-site.xml

<configuration>
  <property>
    <name>sentry.provider</name>
    <value>org.apache.sentry.provider.file.HadoopGroupResourceAuthorizationProvider</value>
  </property>

  <property>
    <name>sentry.solr.provider.resource</name>
    <value>/path/to/authz-provider.ini</value>
    <!-- 
        If the HDFS configuration files (core-site.xml, hdfs-site.xml)
        pointed to by SOLR_HDFS_CONFIG in /etc/default/solr
        point to HDFS, the path will be in HDFS;
        alternatively you could specify a full path, 
        e.g.:hdfs://namenode:port/path/to/authz-provider.ini
    -->
  </property>
</configuration>

Enabling Sentry in Cloudera Search for CDH

Sentry is enabled by adding two properties to /etc/default/solr. If your Search installation is managed by Cloudera Manager, these properties are added automatically. Otherwise, you must make these changes yourself. The variable SOLR_AUTHORIZATION_SENTRY_SITE specifies the path to sentry-site.xml. The variable SOLR_AUTHORIZATION_SUPERUSER specifies the first part of SOLR_KERBEROS_PRINCIPAL. For most users this is solr, which is the default. Settings are of the form:
SOLR_AUTHORIZATION_SENTRY_SITE=/location/to/sentry-site.xml
SOLR_AUTHORIZATION_SUPERUSER=solr

To enable Sentry index-authorization checking on a new collection, the instancedir for the collection must use a modified version of solrconfig.xml with Sentry integration. The command solrctl instancedir --generate generates two versions of solrconfig.xml: the standard solrconfig.xml without Sentry integration, and the Sentry-integrated version called solrconfig.xml.secure. To use the Sentry-integrated version, replace solrconfig.xml with solrconfig.xml.secure before creating the instancedir.

If you have an existing collection called "foo" that uses the standard solrconfig.xml, and an instancedir of the same name, perform the following steps:

# generate a fresh instancedir
solrctl instancedir --generate foosecure
# download the existing instancedir from ZK into subdirectory "foo"
solrctl instancedir --get foo foo
# replace the existing solrconfig.xml with the sentry-enabled one
cp foosecure/conf/solrconfig.xml.secure foo/conf/solrconfig.xml
# update the instancedir in ZK
solrctl instancedir --update foo foo
# reload the collection
solrctl collection --reload foo

If you have an existing collection using a version of solrconfig.xml that you have modified, contact Support for assistance.

Enabling Secure Impersonation

Secure Impersonation is a feature that allows a user to make requests as another user in a secure way. For example, to allow the following impersonations:

  • User "hue" can make requests as any user from any host.
  • User "foo" can make requests as any member of group "bar", from "host1" or "host2".
    Configure the following properties in /etc/default/solr:
    SOLR_SECURITY_ALLOWED_PROXYUSERS=hue,foo
    SOLR_SECURITY_PROXYUSER_hue_HOSTS=*
    SOLR_SECURITY_PROXYUSER_hue_GROUPS=*
    SOLR_SECURITY_PROXYUSER_foo_HOSTS=host1,host2
    SOLR_SECURITY_PROXYUSER_foo_GROUPS=bar
SOLR_SECURITY_ALLOWED_PROXYUSERS lists all of the users allowed to impersonate. For a user x in SOLR_SECURITY_ALLOWED_PROXYUSERS, SOLR_SECURITY_PROXYUSER_x_HOSTS lists the hosts x is allowed to connect from in order to impersonate, and SOLR_SECURITY_PROXYUSER_x_GROUPS lists the groups whose members x is allowed to impersonate. Both GROUPS and HOSTS support the wildcard *, and both must be defined for each user.
  Note: Cloudera Manager has its own management of secure impersonation for Hue. To add additional users for Secure Impersonation, use the environment variable safety valve for Solr to set the environment variables as above. Be sure to include "hue" in SOLR_SECURITY_ALLOWED_PROXYUSERS if you want to use secure impersonation for Hue.
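
The impersonation rules above can be modeled with a short Python sketch. It only illustrates how the variables relate and is not how Solr evaluates them; may_impersonate and split_list are hypothetical helpers. Denying a proxy user whose HOSTS or GROUPS variable is missing is an assumption based on the requirement that both be defined.

```python
def split_list(value):
    """Turn 'a, b,c' into ['a', 'b', 'c'], ignoring empty entries."""
    return [item.strip() for item in value.split(",") if item.strip()]


def may_impersonate(env, proxy_user, host, target_groups):
    """Check whether proxy_user, connecting from host, may act as a user in target_groups."""
    if proxy_user not in split_list(env.get("SOLR_SECURITY_ALLOWED_PROXYUSERS", "")):
        return False
    hosts = env.get(f"SOLR_SECURITY_PROXYUSER_{proxy_user}_HOSTS")
    groups = env.get(f"SOLR_SECURITY_PROXYUSER_{proxy_user}_GROUPS")
    if hosts is None or groups is None:
        # Assumption: both variables must be defined, so a missing one denies.
        return False
    allowed_hosts = split_list(hosts)
    allowed_groups = split_list(groups)
    host_ok = "*" in allowed_hosts or host in allowed_hosts
    group_ok = "*" in allowed_groups or any(g in allowed_groups for g in target_groups)
    return host_ok and group_ok


# The settings from the example above, as they would appear in the environment.
env = {
    "SOLR_SECURITY_ALLOWED_PROXYUSERS": "hue,foo",
    "SOLR_SECURITY_PROXYUSER_hue_HOSTS": "*",
    "SOLR_SECURITY_PROXYUSER_hue_GROUPS": "*",
    "SOLR_SECURITY_PROXYUSER_foo_HOSTS": "host1,host2",
    "SOLR_SECURITY_PROXYUSER_foo_GROUPS": "bar",
}
print(may_impersonate(env, "hue", "anyhost", ["anygroup"]))
print(may_impersonate(env, "foo", "host3", ["bar"]))
```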

Debugging Failed Sentry Authorization Requests

Sentry logs all facts that lead up to authorization decisions at the debug level. If you do not understand why Sentry is denying access, the best way to debug is to temporarily turn on debug logging:
  • In Cloudera Manager, add log4j.logger.org.apache.sentry=DEBUG to the logging settings for your service through the corresponding Logging Safety Valve field for the Impala, Hive Server 2, or Solr Server services.
  • On systems not managed by Cloudera Manager, add log4j.logger.org.apache.sentry=DEBUG to the log4j.properties file on each host in the cluster, in the appropriate configuration directory for each service.
Specifically, look for exceptions and messages such as:
FilePermission server..., RequestPermission server...., result [true|false]
which indicate each evaluation Sentry makes. The FilePermission is from the policy file, while RequestPermission is the privilege required for the query. A RequestPermission will iterate over all appropriate FilePermission settings until a match is found. If no matching privilege is found, Sentry returns false indicating "Access Denied".
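
When the debug log is large, a small script can pull out the evaluation lines. The Python sketch below is a minimal example under stated assumptions: the sample lines reuse the elided form shown above as placeholders rather than real log output, the result value may or may not appear in brackets, and all lines passed in belong to a single request. The names evaluations and was_denied are hypothetical.

```python
import re

# Matches the result of an evaluation line. Brackets are optional because the
# documented form "result [true|false]" may only denote the two alternatives.
RESULT = re.compile(r"result \[?(true|false)\]?")


def evaluations(log_lines):
    """Yield (line, granted) for each line that records a Sentry evaluation."""
    for line in log_lines:
        if "FilePermission" in line and "RequestPermission" in line:
            match = RESULT.search(line)
            if match:
                yield line.rstrip("\n"), match.group(1) == "true"


def was_denied(log_lines):
    """True if evaluations were logged and none of them produced a match."""
    found = list(evaluations(log_lines))
    return bool(found) and not any(granted for _, granted in found)


# Placeholder lines in the elided form from the documentation, not real output.
sample = [
    "FilePermission server..., RequestPermission server...., result false",
    "FilePermission server..., RequestPermission server...., result [true]",
    "unrelated line",
]
print(was_denied(sample))
```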

Appendix: Authorization Privilege Model for Search

The tables below refer to the request handlers defined in the generated solrconfig.xml.secure. If you are not using this configuration file, the tables below may not apply.

admin is a special collection in Sentry used to represent administrative actions. A non-administrative request may only require privileges on the collection on which the request is being performed. This is called collection1 in this appendix. An administrative request may require privileges on both the admin collection and collection1. This is denoted as admin, collection1 in the tables below.

Table 1. Privilege table for non-administrative request handlers
Request Handler      Required Privilege   Collections that Require Privilege
select               QUERY                collection1
query                QUERY                collection1
get                  QUERY                collection1
browse               QUERY                collection1
tvrh                 QUERY                collection1
clustering           QUERY                collection1
terms                QUERY                collection1
elevate              QUERY                collection1
analysis/field       QUERY                collection1
analysis/document    QUERY                collection1
update               UPDATE               collection1
update/json          UPDATE               collection1
update/csv           UPDATE               collection1

Table 2. Privilege table for collections admin actions
Collection Action    Required Privilege   Collections that Require Privilege
create               UPDATE               admin, collection1
delete               UPDATE               admin, collection1
reload               UPDATE               admin, collection1
createAlias          UPDATE               admin, collection1
  Note: "collection1" here refers to the name of the alias, not the underlying collection(s). For example, http://YOUR-HOST:8983/solr/admin/collections?action=CREATEALIAS&name=collection1&collections=underlyingCollection
deleteAlias          UPDATE               admin, collection1
  Note: "collection1" here refers to the name of the alias, not the underlying collection(s). For example, http://YOUR-HOST:8983/solr/admin/collections?action=DELETEALIAS&name=collection1
syncShard            UPDATE               admin, collection1
splitShard           UPDATE               admin, collection1
deleteShard          UPDATE               admin, collection1

Table 3. Privilege table for core admin actions
Collection Action      Required Privilege   Collections that Require Privilege
create                 UPDATE               admin, collection1
rename                 UPDATE               admin, collection1
load                   UPDATE               admin, collection1
unload                 UPDATE               admin, collection1
status                 UPDATE               admin, collection1
persist                UPDATE               admin
reload                 UPDATE               admin, collection1
swap                   UPDATE               admin, collection1
mergeIndexes           UPDATE               admin, collection1
split                  UPDATE               admin, collection1
prepRecover            UPDATE               admin, collection1
requestRecover         UPDATE               admin, collection1
requestSyncShard       UPDATE               admin, collection1
requestApplyUpdates    UPDATE               admin, collection1

Table 4. Privilege table for Info and AdminHandlers
Request Handler             Required Privilege     Collections that Require Privilege
LukeRequestHandler          QUERY                  admin
SystemInfoHandler           QUERY                  admin
SolrInfoMBeanHandler        QUERY                  admin
PluginInfoHandler           QUERY                  admin
ThreadDumpHandler           QUERY                  admin
PropertiesRequestHandler    QUERY                  admin
LoggingHandler              QUERY, UPDATE (or *)   admin
ShowFileRequestHandler      QUERY                  admin
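
As a reading aid for Tables 1 and 2, the following Python sketch encodes them as a lookup that returns the (collection, privilege) pairs a request needs. It is a restatement of the tables, not code from Sentry or Solr; required_privileges is a hypothetical helper, and Tables 3 and 4 are left out for brevity.

```python
# Required privilege per non-administrative request handler (Table 1).
HANDLER_PRIVILEGES = {
    "select": "QUERY", "query": "QUERY", "get": "QUERY", "browse": "QUERY",
    "tvrh": "QUERY", "clustering": "QUERY", "terms": "QUERY", "elevate": "QUERY",
    "analysis/field": "QUERY", "analysis/document": "QUERY",
    "update": "UPDATE", "update/json": "UPDATE", "update/csv": "UPDATE",
}

# Collections admin actions (Table 2). Each needs UPDATE on both the admin
# collection and the target collection (or alias name).
COLLECTION_ADMIN_ACTIONS = {
    "create", "delete", "reload", "createAlias", "deleteAlias",
    "syncShard", "splitShard", "deleteShard",
}


def required_privileges(name, collection):
    """Return the (collection, privilege) pairs needed for a handler or action."""
    if name in HANDLER_PRIVILEGES:
        return [(collection, HANDLER_PRIVILEGES[name])]
    if name in COLLECTION_ADMIN_ACTIONS:
        return [("admin", "UPDATE"), (collection, "UPDATE")]
    raise KeyError(f"unknown handler or action: {name}")


print(required_privileges("select", "collection1"))
print(required_privileges("createAlias", "collection1"))
```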