This is the documentation for Cloudera Manager 4.8.2.
Documentation for other versions is available at Cloudera Documentation.

Introducing Cloudera Manager

Deployment and ongoing administration of a Hadoop stack can be difficult and time-consuming. Deciding which components and versions to deploy based on use cases; assigning roles for nodes; effectively configuring, starting and managing services across the cluster; and performing diagnostics to optimize cluster performance all require significant expertise.

Cloudera Manager is the industry's first end-to-end management application for Apache Hadoop. By delivering granular visibility into and control over every part of the Hadoop cluster, Cloudera Manager empowers enterprise operators to improve cluster performance, enhance quality of service, increase compliance and reduce administrative costs.

Cloudera Manager provides many useful features for monitoring the health and performance of the components of your cluster (hosts, service daemons) as well as the performance and resource demands of the user jobs running on your cluster.

With Cloudera Manager, you can easily deploy and centrally operate a complete Hadoop stack. The application automates the installation process, reducing deployment time from weeks to minutes; gives you a cluster-wide, real-time view of the services running and the status of their hosts; provides a single, central place to enact configuration changes across your cluster; and incorporates a full range of reporting and diagnostic tools to help you optimize cluster performance and utilization.

Cloudera Manager provides full lifecycle management for Apache Hadoop. Specifically, it:

  • Installs the complete Hadoop stack in minutes via a wizard-based interface.
  • Lets you install multiple clusters, with the choice of running CDH3 or CDH4 on a given cluster.
  • Gives you complete, end-to-end visibility into and control over your Hadoop clusters from a single interface.
  • Correlates jobs, activities, logs, system changes, configuration changes, and service and host metrics along a single timeline to simplify diagnosis.
  • Lets you set server roles, configure services and manage security across the cluster.
  • Lets you gracefully start, stop and restart services as needed.
  • Maintains a complete record of configuration changes with the ability to roll back to previous states.
  • Automatically deploys client configuration files for the services you have installed.
  • Supports HDFS High Availability using either Quorum-based storage (introduced with CDH 4.1) for its shared directory, or an NFS-mounted shared edits directory.
  • Monitors dozens of service performance metrics and alerts you when you approach critical thresholds.
  • Lets you gather, view and search Hadoop logs collected from across the cluster.
  • Creates and aggregates relevant Hadoop events pertaining to system health, log messages, user services, and activities, and makes them available for alerting (by email) and searching.
  • Consolidates cluster activity (user jobs) into a single, real-time view.
  • Lets you drill down into individual workflows and jobs at the task attempt level to diagnose performance issues.
  • Shows information pertaining to hosts in your cluster including status, resident memory, virtual memory and roles.
  • Monitors the available space in log and other directories used by Cloudera Manager and CDH components.
  • Provides operational reports on current and historical disk usage by user, group, and directory, as well as MapReduce activity on the cluster by job or user.
  • Takes a snapshot of the cluster state and automatically sends it to Cloudera support to assist with issue resolution.
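
The threshold-based monitoring mentioned in the list above (for example, watching the available space in log directories) can be pictured with a small sketch. This is not Cloudera Manager's implementation or API; the names `Thresholds`, `classify_free_space` and `check_directory`, and the 20% and 10% threshold values, are assumptions chosen purely for illustration.

```python
import shutil
from typing import NamedTuple


class Thresholds(NamedTuple):
    """Free-space fractions below which a directory is flagged.

    The default values are illustrative assumptions, not Cloudera Manager defaults.
    """
    warning: float = 0.20
    critical: float = 0.10


def classify_free_space(free_bytes: int, total_bytes: int,
                        thresholds: Thresholds = Thresholds()) -> str:
    """Return "ok", "warning" or "critical" based on the fraction of free space."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    free_fraction = free_bytes / total_bytes
    # Check the stricter threshold first so "critical" takes precedence.
    if free_fraction < thresholds.critical:
        return "critical"
    if free_fraction < thresholds.warning:
        return "warning"
    return "ok"


def check_directory(path: str, thresholds: Thresholds = Thresholds()) -> str:
    """Classify the free space on the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return classify_free_space(usage.free, usage.total, thresholds)


if __name__ == "__main__":
    # Hypothetical usage: check the filesystem holding the current directory.
    print(check_directory("."))
```

A real monitoring system would run checks like this on a schedule for each host and feed the result into its alerting; the sketch only shows the classification step.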

You work primarily in the Cloudera Manager Admin Console, which runs in a web browser connected to the Cloudera Manager Server. There you can manage configuration settings, monitor the health of your services, and monitor and track user activity on your cluster.