Upgrade from Cloudera Manager 4 to the Latest Version of Cloudera Manager 4
Upgrading from Cloudera Manager 4 Free Edition to the latest version of Cloudera Manager is a relatively simple process, which centers on upgrading Cloudera Manager Server packages. This process applies to upgrading from Cloudera Manager 4 to a newer version of Cloudera Manager 4. For example, this process applies to upgrading Cloudera Manager Free Edition 4.0.2 or Cloudera Manager Free Edition 4.1.2 to Cloudera Manager Free Edition 4.5.
To complete the upgrade, you stop Cloudera Manager, upgrade the packages, and then start Cloudera Manager again.
It is possible to complete the following upgrade without shutting down the Hadoop services. Hadoop daemons can continue running, unaffected, while Cloudera Manager is upgraded.
Cloudera Manager 4.5 added support for Hive, which includes a new role type called the Hive Metastore Server. This role manages the metastore process when Hive is configured with a remote metastore.
When upgrading from a previous CDH version, Cloudera Manager automatically creates new Hive service(s) to capture the previous implicit Hive dependency from Hue and Impala. Your previous services will continue to function without impact.
Note that if Hue was using a Hive metastore of type Derby, then the newly created Hive service will also use Derby. But since Derby does not allow concurrent connections, Hue will continue to work, but the new Hive Metastore Server will fail to run. The failure is harmless (because nothing uses this new Hive Metastore Server at this point) and intentional, to preserve the set of cluster functionality as it was before upgrade. Cloudera discourages the use of a Derby metastore due to its limitations. You should consider switching to a different supported database type (PostgreSQL, MySQL, Oracle).
Cloudera Manager provides a Hive configuration option to bypass the Hive Metastore server. When this configuration is enabled, Hive clients, Hue, and Impala connect directly to the Hive Metastore Database. Prior to Cloudera Manager 4.5, Hue and Impala talked directly to the Hive Metastore Database, so the Bypass mode is enabled by default when upgrading to Cloudera Manager 4.5. This is to ensure the upgrade doesn't disrupt your existing setup. You should plan to disable the Bypass Hive Metastore Server mode, especially when using CDH 4.2 or later. Using the Hive Metastore Server is the recommended configuration. After changing this configuration, you must re-deploy your client configurations, restart Hive, and restart any Hue or Impala services configured to use that Hive.
Cloudera Manager 4.5 also supports Hive Server2 with CDH4.2. Hive Server2 is not added by default, but can be added as a new role under the Hive service (see Adding Role Instances).
Cloudera Manager 4.0 can manage CDH3 and CDH4, but cannot manage CDH4.0 beta. If you upgrade to Cloudera Manager 4.0, you must upgrade any installations of CDH4.0 beta, as well.
Summary: What You are Going to Do
Upgrading from Cloudera Manager 4.0 to the latest version of Cloudera Manager involves the following broad steps:
Step 1. (Optional) Stop the Hive, Hue, Oozie, and Impala services.
This step applies if you are upgrading from Cloudera Manager 4.5 to a newer version; in Cloudera Manager 4.5, these services access the embedded database (for the Hive Metastore) and you will not be able to stop the database while these services are running.
- From the Services tab select All Services in the Cloudera Manager Admin Console.
- Choose Stop on the Actions menus for the Hive, Hue and Oozie services. Do the same for Impala if you have it running.
Step 2. Upgrade the Cloudera Manager Server
In this step, you upgrade the Cloudera Manager Server packages to the latest version.
- Stop the server on the Cloudera Manager Server host:
$ sudo service cloudera-scm-server stop
- Stop the embedded database on the Cloudera Manager Server host:
$ sudo service cloudera-scm-server-db stop
If you are not using the embedded database, you should skip this step.
- Install the new version of the server. To install the new version, you can upgrade from Cloudera's repository at http://archive.cloudera.com/cm4/. Alternatively, you can create your own repository, as described in Appendix A - Understanding Custom Installation Solutions. Creating your own repository is necessary if you are upgrading a cluster that does not have access to the Internet.
On a Red Hat system:
- Find Cloudera's repo file for your distribution by starting at http://archive.cloudera.com/cm4/ and navigating to the directory that matches your operating system. For example, for RedHat or CentOS 6, you would navigate to http://archive.cloudera.com/cm4/redhat/6/x86_64/cm/. Within that directory, find the repo file that contains information including the repository's base URL and gpgkey. As an illustration, the contents of a cloudera-manager.repo file for RedHat or CentOS 5 might appear as follows:
[cloudera-manager]
# Packages for Cloudera Manager, Version 4, on RedHat or CentOS 5 x86_64
name=Cloudera Manager
baseurl=http://archive.cloudera.com/cm4/redhat/5/x86_64/cm/4/
gpgkey = http://archive.cloudera.com/cm4/redhat/5/x86_64/cm/RPM-GPG-KEY-cloudera
gpgcheck = 1
Copy this repo file to the configuration location for the package management software for your system. For example, with Red Hat 6, you would copy the cloudera-manager.repo file to /etc/yum.repos.d/.
- After verifying that you have the correct repo file, run the following commands:
$ sudo yum clean all
$ sudo yum update 'cloudera-*'
- yum clean all cleans up yum's cache directories, ensuring that you download and install the latest versions of the packages.
- If your system is not up to date, and any underlying system components need to be upgraded before this yum update can succeed, yum will tell you what those are.
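Before running yum update, it can help to confirm that the repo file copied into /etc/yum.repos.d/ defines the base URL and gpgkey entries described above. The following is a minimal sketch and not part of Cloudera's tooling: the function name check_repo_file is hypothetical, and it only checks that the two entries are present, not that the URLs are reachable.

```shell
# check_repo_file: hypothetical helper; succeeds only if the given repo file
# contains both a baseurl and a gpgkey entry (spaces around '=' are allowed).
check_repo_file() {
    repo_file=$1
    [ -r "$repo_file" ] || { echo "cannot read $repo_file" >&2; return 1; }
    grep -Eq '^[[:space:]]*baseurl[[:space:]]*=' "$repo_file" || {
        echo "$repo_file: no baseurl entry" >&2; return 1; }
    grep -Eq '^[[:space:]]*gpgkey[[:space:]]*=' "$repo_file" || {
        echo "$repo_file: no gpgkey entry" >&2; return 1; }
    echo "$repo_file: baseurl and gpgkey present"
}

# Example use (path taken from the text above):
#   check_repo_file /etc/yum.repos.d/cloudera-manager.repo && sudo yum update 'cloudera-*'
```

Chaining the check with && means the update runs only when the repo file passes the check.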
On a SLES system:
- To install the latest version from Cloudera's repository, run the following commands:
$ sudo zypper clean --all
$ sudo zypper up -r http://archive.cloudera.com/cm4/sles/11/x86_64/cm/4/
To download from your own repository:
$ sudo zypper clean --all
$ sudo zypper rr cm
$ sudo zypper ar -t rpm-md http://myhost.example.com/path_to_cm_repo/ cm
$ sudo zypper up -r http://myhost.example.com/path_to_cm_repo
On a Debian/Ubuntu system:
- Use the following commands to clean cached repository information and update Cloudera Manager components:
$ sudo apt-get clean
$ sudo apt-get update
$ sudo apt-get install cloudera-manager-server cloudera-manager-agent cloudera-manager-daemons
As this process proceeds, you may be prompted concerning your configuration file version:
Configuration file `/etc/cloudera-scm-agent/config.ini'
 ==> Modified (by you or by a script) since installation.
 ==> Package distributor has shipped an updated version.
   What would you like to do about it ?  Your options are:
    Y or I  : install the package maintainer's version
    N or O  : keep your currently-installed version
      D     : show the differences between the versions
      Z     : start a shell to examine the situation
 The default action is to keep your current version.
You will receive a similar prompt for /etc/cloudera-scm-server/db.properties. Answer N to both these prompts.
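For an unattended upgrade, answering N to these prompts corresponds to dpkg's --force-confold option, which keeps the currently installed configuration file without asking. The sketch below assumes you want that behavior for every package in the transaction. The run wrapper and the DRY_RUN variable are hypothetical conveniences, not part of apt or Cloudera Manager: by default the command is only printed so that you can review it first.

```shell
# Set DRY_RUN=0 to actually execute the command; by default it is only printed.
DRY_RUN=${DRY_RUN:-1}

# run: hypothetical wrapper that prints the command in dry-run mode and
# executes it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Keep existing configuration files (same effect as answering N at each prompt).
run sudo apt-get install -o Dpkg::Options::=--force-confold \
    cloudera-manager-server cloudera-manager-agent cloudera-manager-daemons
```

Note that --force-confold applies to every modified configuration file in the transaction, not only to config.ini and db.properties.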
At the end of this process you should have the 4.5 versions of the following packages installed on the Cloudera Manager Server host. For example:
$ rpm -qa 'cloudera-manager-*'
cloudera-manager-daemons-4.5.0-1.cm450.p0.235.x86_64
cloudera-manager-server-4.5.0-1.cm450.p0.235.x86_64
cloudera-manager-agent-4.5.0-1.cm450.p0.235.x86_64
You may also see additional packages for plugins, depending on what was previously installed on the Server host. If the commands to update the server complete without errors, you can assume the upgrade has completed as desired. For additional assurance, you will have the option to check that the server versions have been updated after you start the server. The process of checking the server version is described in Step 5. Verify the Upgrade Succeeded.
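The package check above can be scripted. The following is a minimal sketch that assumes package names follow the rpm -qa pattern shown in the example output; the helper names cm_pkg_version and cm_check_versions are hypothetical.

```shell
# cm_pkg_version: hypothetical helper; extracts the version (for example
# 4.5.0) from an rpm -qa style name such as
# cloudera-manager-server-4.5.0-1.cm450.p0.235.x86_64.
cm_pkg_version() {
    echo "$1" | sed -n 's/^cloudera-manager-[a-z]*-\([0-9][0-9.]*\)-.*$/\1/p'
}

# cm_check_versions: reads package names on stdin and returns non-zero if
# any Cloudera Manager package is not at the expected version (for example
# 4.5 matches 4.5 and 4.5.x).
cm_check_versions() {
    expected=$1
    status=0
    while read -r pkg; do
        ver=$(cm_pkg_version "$pkg")
        [ -n "$ver" ] || continue            # skip names that do not match
        case $ver in
            "$expected"|"$expected".*) echo "ok       $pkg" ;;
            *)                         echo "MISMATCH $pkg"; status=1 ;;
        esac
    done
    return $status
}

# Example use on an RPM-based host:
#   rpm -qa 'cloudera-manager-*' | cm_check_versions 4.5
```

On a Debian/Ubuntu host the package listing has a different format, so the pattern in cm_pkg_version would need to be adapted.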
Step 3. Start the Server
To start the server
On the Cloudera Manager Server host (the system on which you installed the cloudera-manager-server.noarch package) do the following:
$ sudo service cloudera-scm-server-db start
$ sudo service cloudera-scm-server start
The sudo service cloudera-scm-server-db start command is necessary only if you are using the embedded PostgreSQL database.
You should see the following:
Starting cloudera-scm-server: [ OK ]
If you have problems starting the server, such as database permissions problems, you can use the server's log /var/log/cloudera-scm-server/cloudera-scm-server.log to troubleshoot the problem.
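When troubleshooting, it is often quicker to pull only the error lines out of the server log. The sketch below assumes that error-level messages in the log contain the word ERROR, which should be verified against your own log; the function name cm_log_errors is hypothetical.

```shell
# cm_log_errors: hypothetical helper; prints the last N lines (default 20)
# containing ERROR from the Cloudera Manager Server log.
# Usage: cm_log_errors [LOG_FILE] [MAX_LINES]
cm_log_errors() {
    log_file=${1:-/var/log/cloudera-scm-server/cloudera-scm-server.log}
    max_lines=${2:-20}
    [ -r "$log_file" ] || { echo "cannot read $log_file" >&2; return 1; }
    grep 'ERROR' "$log_file" | tail -n "$max_lines"
}

# Example use (the log may require root to read):
#   sudo sh -c '. ./cm_log_errors.sh; cm_log_errors'
```

The file name cm_log_errors.sh in the usage comment is also illustrative; it stands for wherever you save the function.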
Step 4. Upgrade the Cluster Hosts
Cloudera Manager can automatically upgrade existing agents. After you upgrade Cloudera Manager, when it is started for the first time, it checks for any older versions of agents. If older agents are detected, Cloudera Manager provides the opportunity to automatically update agents, which you should do, unless you have some reason not to do so. After updating Cloudera Manager, connect to Cloudera Manager and use the wizard to continue the upgrade process. In this part of the process, the Cloudera Manager agents are updated and databases are updated.
All hosts in the cluster must have access to the Internet if you plan to use archive.cloudera.com as the source for installation files. If you do not have Internet access, create a custom repository.
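The upgrade wizard asks for root or an account with password-less sudo, and for an SSH port. If you are unsure whether an account qualifies, you can test it from the Cloudera Manager Server host before starting the wizard. The sketch below only prints one ssh command per host for you to review and run. The function name cm_sudo_check_cmds, the user name admin, and the host names are hypothetical. BatchMode=yes makes ssh fail rather than prompt, so the printed commands are suitable only for public-key authentication, and sudo -n fails instead of asking for a password.

```shell
# cm_sudo_check_cmds: hypothetical helper; prints one ssh command per host.
# Each printed command succeeds only if the account can log in with a key
# and run sudo without being asked for a password.
# Usage: cm_sudo_check_cmds USER PORT HOST...
cm_sudo_check_cmds() {
    user=$1
    port=$2
    shift 2
    for host in "$@"; do
        echo "ssh -p $port -o BatchMode=yes $user@$host sudo -n true"
    done
}

# Example: print the checks for two hosts on the default SSH port (22).
cm_sudo_check_cmds admin 22 host1.example.com host2.example.com
```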
- Log in to the Cloudera Manager Admin Console. If you have just restarted the Cloudera Manager server, you may need to log in again.
- On the Welcome screen, click Continue to proceed to the Upgrade cluster hosts screen.
- On the Upgrade cluster hosts screen, verify that the hosts you want to upgrade appear. You can search for additional hosts, if you need to, by entering their hostnames or IP addresses under Add new hosts to cluster and clicking Find Hosts. When all the hosts are shown as being managed, click Continue.
- Select the release of the Cloudera Manager Agent to install. Normally, this will be the Matched Release for this Cloudera Manager Server. However, if you used a custom repository for the Cloudera Manager server, select Custom Repository and provide the required information. Click Continue to proceed.
- Provide credentials for authenticating with hosts.
- Select root or enter the user name for an account that has password-less sudo permissions.
- Select an authentication method.
- If you choose to use password authentication, enter and confirm the password.
- If you choose to use public-key authentication, provide a passphrase and the path to the required key files.
- You can choose to specify an alternate SSH port. The default value is 22.
- You can specify the maximum number of host installations to run at once. The default value is 10.
- Click Start Installation to install and start Cloudera Manager Agents. The status of installation on each host is displayed on the page that appears after you click Start Installation. You can also click the Details link for individual hosts to view detailed information about the installation and error messages if installation fails on any hosts.
If you click the Abort Installation button while installation is in progress, it will halt any pending or in-progress installations and roll back any in-progress installations to a clean state. The Abort Installation button does not affect host installations that have already completed successfully or already failed.
If installation fails on a host, you can click the Retry link next to the failed host to try installation on that host again. To retry installation on all failed hosts, click Retry Failed Hosts at the bottom of the screen.
- When the Continue button appears at the bottom of the screen, the installation process is complete. If the installation has completed successfully on some hosts but failed on others, you can click Continue if you want to skip installation on the failed hosts.
- The Host Inspector runs to inspect your managed hosts for correct versions and configurations. If there are problems, you can make changes and then re-run the inspector. When you are satisfied with the inspection results, click Continue.
- On the next page, select the host where the Hive Metastore Server role should be installed. The Hive service is now managed by Cloudera Manager, so you must select a host for the Hive Metastore Server. You should assign the Hive Metastore Server to a single host.
- Review the configuration values for your Hive roles, and click Accept to continue.
If Hue is using a Hive metastore of type Derby (the default), then the newly created Hive service will also use Derby. However, since Derby does not allow concurrent connections, Hue will continue to work but the new Hive Metastore Server will fail to start. The failure is harmless (because nothing uses this new Hive Metastore Server at this point) and intentional, to preserve the cluster functionality that existed before the upgrade.
If you are upgrading to Cloudera Manager 4.5 or later from a release prior to 4.5 (that is, 4.1 or earlier), Hive's metastore bypass mode is enabled by default. You should plan to disable the Bypass Hive Metastore Server mode, especially when using CDH 4.2 or later. Using the Hive Metastore Server is the recommended configuration. After changing this configuration, you must re-deploy your client configurations, restart Hive, and restart any Hue or Impala services configured to use that Hive.
- Your services (except for Hive) should now be running.
You have now completed the upgrade to Cloudera Manager 4.5.
Step 5. Verify the Upgrade Succeeded
If the commands to update and start the server complete without errors, you can assume the upgrade has completed as desired. For additional assurance, you can check that the server versions have been updated.
After you upgrade Cloudera Manager, when it is started for the first time, it checks for any older versions of agents. If older agents are detected, Cloudera Manager provides the opportunity to automatically update agents using the upgrade wizard.
To verify the server upgrade completed as expected
- Connect to the Cloudera Manager Admin Console as described in Cloudera Manager Free Edition User Guide.
- Click the Hosts tab.
- Click Host Inspector.
- Click Show Inspector Results.
All results from the host inspector process are displayed including the currently installed versions. If this includes listings of current component versions, the installation completed as expected.
Step 6. Add Hive Gateway roles to hosts where Hive clients should run.
- In the Cloudera Manager Admin console, pull down the Services tab and select the Hive service.
- Go to the Instances tab, and click the Add button. This opens the Add Role Instances page.
- Select the hosts on which you want a Hive Gateway role to run. This will ensure that the Hive client configurations are deployed on these hosts.
Step 7. Restart the services you stopped in Step 1
You must restart the services (Hive, Hue, Oozie, and Impala) that you stopped at the beginning of this procedure.
- From the Services tab select All Services in the Cloudera Manager Admin Console.
- Choose Start on the Actions menu for each service you need to start.
Testing the Installation
When you have finished the upgrade to Cloudera Manager, you can test the installation; follow instructions under Testing the Installation.
Step 8. Deploy Updated Client Configurations
During upgrades between major versions, resource locations may change. To ensure clients have current information about resources, update client configuration as described in Deploying Client Configuration Files.
Step 9. (Optional) Upgrade CDH
Cloudera Manager Free Edition 4.x can manage both CDH3 and CDH4, so upgrading existing CDH3 installations is not required, but to get the benefits of CDH4, you may want to upgrade to the latest version. See the following topics for more information on upgrading CDH:
- Upgrading to the Latest Version of CDH4 in a Cloudera Managed Deployment - Follow this path to upgrade existing installations of CDH4 to the latest version of CDH4.
- Upgrading CDH3 to CDH4 in a Cloudera Managed Deployment - Follow this path to upgrade existing installations of CDH3 to the latest version of CDH4. You can also install Impala when you upgrade to CDH4 version 4.1.2 or later.
- Upgrading to the Latest Version of CDH3 in a Cloudera Managed Deployment - Follow this path to upgrade existing installations of CDH3 to the latest version of CDH3. Consider upgrading to CDH4 instead of upgrading to the latest version of CDH3.