— MapReduce applications from CDH3 must be recompiled in CDH4.
Users will need to recompile their applications when going from CDH3 to CDH4 (even to use MRv1). Note that once applications have been compiled with CDH4 libraries, they will not need to be recompiled to move from MRv1 to MRv2 (YARN) in CDH4.
Anticipated resolution: None planned
— Streaming jobs may not be recovered successfully when you use CDH4 MRv1 with Cloudera Manager 4.0.x
When job recovery is enabled (mapred.jobtracker.restart.recover is set to true), streaming jobs may not be recovered successfully if you are using CDH4 with Cloudera Manager 4.0.x.
Resolution: None; use workaround.
Workaround: Use Cloudera Manager 4.1 or later and set Automatically Restart Process to false in Cloudera Manager.
— No JobTracker becomes active if both JobTrackers are migrated to other hosts
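If JobTrackers in a High Availability configuration are shut down, migrated to new hosts, and then restarted, no JobTracker becomes active. The logs show a Mismatched address exception.
Workaround: Delete the JobTracker HA state from ZooKeeper, replacing <logical name> with the logical name of the JobTracker HA pair:
$ zkCli.sh rmr /hadoop-ha/<logical name>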
— Hadoop Pipes may not be usable in an MRv1 Hadoop installation done through tarballs
Under MRv1, MapReduce's C++ interface, Hadoop Pipes, may not be usable with a Hadoop installation done through tarballs unless you build the C++ code on the operating system you are using.
Resolution: None planned; use workaround.
Workaround: Build the C++ code on the operating system you are using. The C++ code is present under src/c++ in the tarball.
— Default port conflicts
By default, the Shuffle Handler (which runs inside the YARN NodeManager), the REST server, and many third-party applications all use port 8080. Deploying more than one of them on the same host without reconfiguring the default port results in a port conflict.
Workaround: Make sure at most one service uses port 8080. To reconfigure the REST server, follow these instructions. To change the default port for the Shuffle Handler, set the value of mapreduce.shuffle.port in mapred-site.xml to an unused port.
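As a minimal sketch of the Shuffle Handler part of this workaround, the shell function below prints a property element that could be pasted into mapred-site.xml. The function name shuffle_port_property and the port 13562 are illustrative assumptions, not values taken from this document; choose any port that is free on your NodeManager hosts.

```shell
# Print a mapred-site.xml <property> element that moves the Shuffle Handler
# off port 8080. The function name and the example port are hypothetical.
shuffle_port_property() {
  port="$1"
  # Accept only a non-empty, all-digit port number.
  case "$port" in
    ''|*[!0-9]*) echo "port must be a number" >&2; return 1 ;;
  esac
  # 8080 is the conflicting default, so reject it.
  if [ "$port" -eq 8080 ]; then
    echo "8080 is the conflicting default; pick another port" >&2
    return 1
  fi
  printf '<property>\n  <name>mapreduce.shuffle.port</name>\n  <value>%s</value>\n</property>\n' "$port"
}

shuffle_port_property 13562
```

The printed element belongs inside the existing configuration element of mapred-site.xml on each NodeManager host.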
— Task-completed percentage may be reported as slightly under 100% in the web UI, even when all of a job's tasks have successfully completed.
— Spurious warning in MRv1 jobs
The mapreduce.client.genericoptionsparser.used property is not correctly checked by JobClient and this leads to a spurious warning.
Workaround: MapReduce jobs using GenericOptionsParser or implementing Tool can remove the warning by setting this property to true.
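One way to apply this workaround without changing job code might be to pass the property as a generic option at submit time. The sketch below only prints the command it would run. The jar name, driver class, and paths are hypothetical, and it assumes the driver parses generic options (for example by implementing Tool), so that -D placed after the class name is honored.

```shell
# Hypothetical jar and driver class; replace with your own.
JOB_JAR="wordcount.jar"
JOB_CLASS="WordCount"

# Print a submit command that sets the property to true.
# Generic options such as -D go after the class name and before job arguments.
build_submit_cmd() {
  printf 'hadoop jar %s %s -D %s=true %s %s\n' \
    "$JOB_JAR" "$JOB_CLASS" "mapreduce.client.genericoptionsparser.used" "$1" "$2"
}

build_submit_cmd /user/alice/input /user/alice/output
```

The same property can also be set to true in the job's Configuration object before submission.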
— Oozie workflows will not be recovered in the event of a JobTracker failover on a secure cluster
Delegation tokens created by clients (via JobClient#getDelegationToken()) do not persist when the JobTracker fails over. This limitation means that Oozie workflows will not be recovered successfully in the event of a failover on a secure cluster.
Workaround: Re-submit the workflow.
— Encrypted shuffle in MRv2 does not work if used with LinuxContainerExecutor and encrypted web UIs.
In MRv2, if the LinuxContainerExecutor is used (usually as part of Kerberos security) and hadoop.ssl.enabled is set to true (see Configuring Encrypted Shuffle, Encrypted Web UIs, and Encrypted HDFS Transport), the encrypted shuffle does not work and the submitted job fails.
Workaround: Use encrypted shuffle with Kerberos security without encrypted web UIs, or use encrypted shuffle with encrypted web UIs without Kerberos security.
— Link from ResourceManager to Application Master does not work when the Web UI over HTTPS feature is enabled.
In MRv2 (YARN), if hadoop.ssl.enabled is set to true (use HTTPS for web UIs), then the link from the ResourceManager to the running MapReduce Application Master fails with an HTTP Error 500 because of a PKIX exception.
A job can still be run successfully, and, when it finishes, the link to the job history does work.
Workaround: Don't use encrypted web UIs.
— Pipes jobs compiled against MRv1 cannot be run on MRv2 (YARN).
Hadoop Pipes jobs compiled against CDH versions prior to CDH4 must be recompiled against CDH4 YARN before you can run them on the new YARN framework. Pipes jobs compiled against CDH4 MRv1 must likewise be recompiled against CDH4 YARN before you can run them on YARN.
Workaround: Recompile the job against CDH4 YARN to run it on MRv2.
— Hadoop client JARs don't provide all the classes needed for clean compilation of client code
Compiling client code against the Hadoop client JARs may produce warnings such as the following:
$ javac -cp '/usr/lib/hadoop/client/*' -d wordcount_classes WordCount.java
org/apache/hadoop/fs/Path.class(org/apache/hadoop/fs:Path.class): warning: Cannot find annotation method 'value()' in type 'org.apache.hadoop.classification.InterfaceAudience.LimitedPrivate': class file for org.apache.hadoop.classification.InterfaceAudience not found
1 warning
— The ulimits setting in /etc/security/limits.conf is applied to the wrong user if security is enabled.
Resolution: None; use workaround.
Workaround: To increase the ulimits applied to DataNodes, you must change the ulimit settings for the root user, not the hdfs user.
— Must set yarn.resourcemanager.scheduler.address to routable host:port when submitting a job from the ResourceManager
When you submit a job from the ResourceManager, yarn.resourcemanager.scheduler.address must be set to a real, routable address, not the wildcard 0.0.0.0.
Resolution: None; use workaround.
Workaround: Set the address, in the form host:port, either in the client-side configuration, or on the command line when you submit the job.
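The command-line form of this workaround could be sketched as follows. The helper name scheduler_address_opt and the host rm01.example.com are hypothetical, and port 8030 is assumed here as the scheduler port; substitute the values your ResourceManager actually uses. The helper refuses the wildcard address and prints a -D option to place on the job submission command line.

```shell
# Print a -D option carrying a routable scheduler address.
# Helper name, example host, and example port are illustrative assumptions.
scheduler_address_opt() {
  host="$1"
  port="$2"
  # The wildcard address is exactly what this workaround must avoid.
  if [ -z "$host" ] || [ "$host" = "0.0.0.0" ]; then
    echo "scheduler host must be a routable address, not empty or 0.0.0.0" >&2
    return 1
  fi
  # Accept only a non-empty, all-digit port number.
  case "$port" in
    ''|*[!0-9]*) echo "port must be a number" >&2; return 1 ;;
  esac
  printf '%s\n' "-D yarn.resourcemanager.scheduler.address=${host}:${port}"
}

scheduler_address_opt rm01.example.com 8030
# prints: -D yarn.resourcemanager.scheduler.address=rm01.example.com:8030
```

Passing the option on the command line only takes effect if the job's driver parses generic options; otherwise, set the same property in the client-side configuration.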