This is the documentation for Cloudera Manager 4.8.2.
Documentation for other versions is available at Cloudera Documentation.

Installing and Configuring a MySQL Database

You can use MySQL databases to store information for Cloudera Manager. Cloudera Manager monitoring services require databases to store information, so you will need to create databases for each of those entities.

You should also create a database for the Hive Metastore Server.

See your MySQL documentation for more information about installation and configuration.

  Note: If you already have a MySQL database set up, you can skip to the section Configuring MySQL for the Cloudera Manager Databases to verify that your MySQL configurations meet the requirements for Cloudera Manager.

Installing MySQL

To install MySQL on a Red Hat system:

$ sudo yum install mysql-server

To install MySQL on a SLES system:

$ sudo zypper install mysql
$ sudo zypper install libmysqlclient_r15
  Note:

Some SLES systems encounter errors when using the preceding zypper install command. For more information on resolving this issue, see Novell's Knowledgebase topic, error running chkconfig.

To install MySQL on an Debian/Ubuntu system:

$ sudo apt-get install mysql-server

After issuing the command to install MySQL, you may need to respond to prompts to confirm that you do want to complete the installation. After installation completes, start the mysql daemon.

On Red Hat systems

$ sudo service mysqld start

On SLES and Debian/Ubuntu systems

$ sudo service mysql start

Configuring MySQL for the Cloudera Manager Databases

The default settings in the MySQL installations in most distributions are very conservative with regards to buffer sizes and memory usage. In addition, Cloudera Manager requires InnoDB. Cloudera Management services need high write throughput as, based on cluster size, they may insert a lot of records in the database. Therefore Cloudera recommends that you set innodb_flush_method to O_DIRECT.

For the databases used by Cloudera Manager, Cloudera recommends that you update /etc/my.cnf or /etc/mysql/my.cnf to at least the values shown below. It is important that the datadir directory, which, by default, is /var/lib/mysql, is on a partition that has plentiful free space.

Recommended Settings

  1. Determine the version of MySQL.

      Important:
    • For a fresh MySQL installation on Red Hat or SLES systems, apply the settings in the next step before you start MySQL for the first time.
    • For an existing installation and a new installation on Ubuntu, you must perform the steps in Reconfiguring InnoDB Settings for an Existing MySQL Installationbefore changing InnoDB settings.
  2. Recommended MySQL configurations settings are as follows. You should incorporate these changes as appropriate into your configuration settings.
    [mysqld]
    transaction-isolation=READ-COMMITTED
    # Disabling symbolic-links is recommended to prevent assorted security risks;
    # to do so, uncomment this line:
    # symbolic-links=0
    
    key_buffer              = 16M
    key_buffer_size         = 32M
    max_allowed_packet      = 16M
    thread_stack            = 256K
    thread_cache_size       = 64
    query_cache_limit       = 8M
    query_cache_size        = 64M
    query_cache_type        = 1
    # Important: see Configuring the Databases and Setting max_connections
    max_connections         = 550
    
    # log-bin should be on a disk with enough free space
    log-bin=/x/home/mysql/logs/binary/mysql_binary_log
    
    # For MySQL version 5.1.8 or later. Comment out binlog_format for older versions.
    binlog_format           = mixed
    
    read_buffer_size = 2M
    read_rnd_buffer_size = 16M
    sort_buffer_size = 8M
    join_buffer_size = 8M
    
    # InnoDB settings
    innodb_file_per_table = 1
    innodb_flush_log_at_trx_commit  = 2
    innodb_log_buffer_size          = 64M
    innodb_buffer_pool_size         = 4G
    innodb_thread_concurrency       = 8
    innodb_flush_method             = O_DIRECT
    innodb_log_file_size = 512M
    
    [mysqld_safe]
    log-error=/var/log/mysqld.log
    pid-file=/var/run/mysqld/mysqld.pid

Configuring the Databases and Setting max_connections

The definition of a small or large cluster is not absolute, so this information is intended as general guidance. For the purposes of this discussion, clusters with fewer than 50 hosts can be considered small clusters and clusters with more than 50 hosts can be considered large clusters.

Follow these guidelines:

  • In a small cluster, you can store more than one database (for example, both the Activity Monitor and Service Monitor) on the same host. If you do this, you should:
  • Put each database on its own storage volume.
  • Allow 100 maximum connections for each database and then add 50 extra connections. For example, for two databases set the maximum connections to 250. If you store seven databases on one host (the databases for Activity Monitor, Service Monitor, Report Manager, Host Monitor, Cloudera Manager Server, Cloudera Navigator, and Hive Metastore), set the maximum connections to 750.
  • In a large cluster, do not store more than one database on the same host. In such a case, use a separate host for each database/host pair. The hosts need not be reserved exclusively for databases, but each database should be on a separate host.

Reconfiguring InnoDB Settings for an Existing MySQL Installation

To update InnoDB settings on all hosts that are using an existing MySQL installation, proceed as follows.

  1. Stop MySQL.
    • Red Hat
      $ sudo service mysqld stop
    • SLES and Debian/Ubuntu
      $ sudo service mysql stop
      • Edit the InnoDB entries in /etc/my.cnf as shown in the previous section.
      • Move the old InnoDB log files to a backup location. The two files to move are /var/lib/mysql/ib_logfile0 and /var/lib/mysql/ib_logfile1. Make sure you move these files out of the /var/lib/mysql/ directory (don't copy them and leave the originals in place).
      • Start MySQL.
    • Red Hat
      $ sudo service mysqld start
    • SLES and Debian/Ubuntu
      $ sudo service mysql start

Installing the MySQL JDBC Connector

Install the JDBC connector on the Cloudera Manager Server host, as well as hosts to which you assign the Activity Monitor, Service Monitor, Report Manager, Host Monitor, Cloudera Navigator, and Hive Metastore Server roles.

  Note: If you already have the JDBC connector installed on the hosts that need it, you can skip this section.
Cloudera recommends that you assign all roles that require databases on the same host and install the connector on that host. While putting all such roles on the same host is recommended, it is not required. You could install a role, such as Activity Monitor on one host and other roles on a separate host. In such a case you would install the JDBC connector on each host running roles that access the database.
  • Red Hat 6
    $ sudo yum install mysql-connector-java
  • Red Hat 5
    1. Download the MySQL JDBC connector from http://www.mysql.com/downloads/connector/j/5.1.html.
    2. Extract the JDBC driver JAR file from the downloaded file; for example:
      tar zxvf mysql-connector-java-5.1.18.tar.gz
    3. Add the JDBC driver, renamed, to the relevant server; for example:
      $ sudo cp mysql-connector-java-5.1.18/mysql-connector-java-5.1.18-bin.jar /usr/share/java/mysql-connector-java.jar

      If the target directory does not yet exist on this host, you can create it before copying the .jar file; for example:

      $ sudo mkdir -p /usr/share/java/
      $ sudo cp mysql-connector-java-5.1.18/mysql-connector-java-5.1.18-bin.jar /usr/share/java/mysql-connector-java.jar
  • SLES
    $ sudo zypper install mysql-connector-java
  • Debian/Ubuntu
    $ sudo apt-get install libmysql-java

Configuring MySQL

Configure MySQL to use a strong password and to start at boot.

  1. Set the MySQL root password. Note that in the following procedure, your current root password is blank. Press the Enter key when you're prompted for the root password.
    $ sudo /usr/bin/mysql_secure_installation
    [...]
    Enter current password for root (enter for none):
    OK, successfully used password, moving on...
    [...]
    Set root password? [Y/n] y
    New password:
    Re-enter new password:
    Remove anonymous users? [Y/n] Y
    [...]
    Disallow root login remotely? [Y/n] N
    [...]
    Remove test database and access to it [Y/n] Y
    [...]
    Reload privilege tables now? [Y/n] Y
    All done!
  2. Ensure the MySQL server starts at boot.
    • Red Hat
      $ sudo /sbin/chkconfig mysqld on
      $ sudo /sbin/chkconfig --list mysqld
      mysqld          0:off   1:off   2:on    3:on    4:on    5:on    6:off
    • SLES
      $ sudo chkconfig --add mysql
    • Debian/Ubuntu
      $ sudo chkconfig mysql on

Creating the MySQL Databases for Cloudera Manager

The next step involves creating databases and user accounts for all database-backed services in Cloudera Manager.

Create databases for each of the following features that are part of the Cloudera Management Services:

  • Activity Monitor
  • Service Monitor
  • Report Manager
  • Host Monitor
  • Cloudera Navigator (optional -- Cloudera Navigator is a separately-purchased add-on to Cloudera Enterprise)

In addition add a database for each Hive Metastore Server.

You can create these databases on the host where the Cloudera Manager Server will run, or on any other nodes in the cluster. For performance reasons, you should typically install each database on the host on which the service runs, as determined by the roles you will assign during installation or upgrade. In larger deployments or in cases where database administrators (DBAs) are managing the databases the services will use, databases may be separated from services, but do not undertake such an implementation lightly.

Note the values you enter for database names, user names, and passwords. The Cloudera Manager installation wizard requires this information to correctly connect to these databases.

The database must be configured to support UTF-8 character set encoding. The sample commands below include the required options to enable UTF-8 support.

  1. Log into MySQL as the root user:
    $ mysql -u root -p
    Enter password:
  2. Create a database for the Activity Monitor.
      Note: The database name, user name, and password can be anything you want. The examples shown match the default names provided in the Cloudera Manager Hive configuration settings.
    mysql> create database amon DEFAULT CHARACTER SET utf8;
    Query OK, 1 row affected (0.00 sec)
    
    mysql> grant all on amon.* TO 'amon'@'%' IDENTIFIED BY 'amon_password';
    Query OK, 0 rows affected (0.00 sec)
  3. Create a database for the Service Monitor. For example:
    mysql> create database smon DEFAULT CHARACTER SET utf8;
    Query OK, 1 row affected (0.00 sec)
    
    mysql> grant all on smon.* TO 'smon'@'%' IDENTIFIED BY 'smon_password';
    Query OK, 0 rows affected (0.00 sec)
  4. Create a database for the Report Manager. For example:
    mysql> create database rman DEFAULT CHARACTER SET utf8;
    Query OK, 1 row affected (0.00 sec)
    
    mysql> grant all on rman.* TO 'rman'@'%' IDENTIFIED BY 'rman_password';
    Query OK, 0 rows affected (0.00 sec)
  5. Create a database for the Host Monitor. For example:
    mysql> create database hmon DEFAULT CHARACTER SET utf8;
    Query OK, 1 row affected (0.00 sec)
    
    mysql> grant all on hmon.* TO 'hmon'@'%' IDENTIFIED BY 'hmon_password';
    Query OK, 0 rows affected (0.00 sec)
  6. (Optional) Create a database for Cloudera Navigator. For example:
    mysql> create database nav DEFAULT CHARACTER SET utf8;
    Query OK, 1 row affected (0.00 sec)
    
    mysql> grant all on nav.* TO 'nav'@'%' IDENTIFIED BY 'nav_password';
    Query OK, 0 rows affected (0.00 sec)

Create the Database for the Hive Metastore and Impala Catalog Daemon

Create a separate metastore for each Hive service, if you have more than one.
  1. Create a database for the Hive metastore. For example:
    mysql> create database hive DEFAULT CHARACTER SET utf8;
    Query OK, 1 row affected (0.00 sec)
    
    mysql> grant all on hive.* TO 'hive'@'%' IDENTIFIED BY 'hive_password';
    Query OK, 0 rows affected (0.00 sec)

Backing Up the MySQL Databases

Cloudera recommends that you periodically back up the databases that Cloudera Manager uses to store configuration, monitoring, and reporting data and for managed services:
  • Cloudera Manager database: Contains all the information about what services you have configured, their role assignments, all configuration history, commands, users, and running processes. This is a relatively small database (<100MB), and is the most important to back up.
  • Activity Monitor database: Contains information about past activities. In large clusters, this database can grow large.
  • Service Monitor database: Contains monitoring information about daemons. In large clusters, this database can grow large.
  • Report Manager database: Keeps track of disk utilization over time. Medium-sized.
  • Host Monitor database: Contains information about host status. In large clusters, this database can grow large.
  • Cloudera Navigator database: Contains auditing information. In large clusters, this database can grow large.
  • Hive Metastore database: Contains Hive metadata. Relatively small.
To back up the MySQL database, run the mysqldump command on the MySQL host, as follows:
$ mysqldump -h<hostname> -u<username> -p<password> <database> > /tmp/<database-backup>.sql

For example, to back up the sample database created for the Activity Monitor above, amon on the local host as the root user, with the password mypasswd:

$ mysqldump -pmypasswd amon > /tmp/amon-backup.sql

To back up the sample Activity Monitor database amon on remote host myhost.example.com as the root user, with the password mypasswd:

$ mysqldump -hmyhost.example.com -uroot -pcloudera amon > /tmp/amon-backup.sql