- The CDH4 cluster must have a MapReduce service running on it. This may be MRv1 or YARN (MRv2).
- All the MapReduce nodes in the CDH4 cluster should have full network access to all the nodes of the source cluster. This allows you to perform the copy in a distributed manner.
Here, source refers to the CDH3 (or other Hadoop) cluster you want to migrate or copy data from, and destination refers to the CDH4 cluster.
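
The network-access requirement can be checked before starting a copy. The following is a minimal sketch, not part of any Cloudera tooling: a small Python script that you could run on each MapReduce node of the destination cluster to confirm it can open TCP connections to the source cluster. The function name `unreachable_endpoints`, the host names, and the ports are assumptions for illustration (50070 and 50075 are the usual default NameNode and DataNode HTTP ports on CDH3, but your cluster may use different values).

```python
import socket


def unreachable_endpoints(endpoints, timeout=3.0):
    """Return the (host, port) pairs that could not be reached over TCP."""
    failed = []
    for host, port in endpoints:
        try:
            # Open and immediately close a TCP connection to the endpoint.
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError:
            # Covers refused connections, timeouts, and DNS lookup failures.
            failed.append((host, port))
    return failed


if __name__ == "__main__":
    # Hypothetical source-cluster endpoints; replace with your own
    # NameNode and DataNode hosts and ports.
    source_endpoints = [
        ("cdh3-namenode.example.com", 50070),
        ("cdh3-datanode1.example.com", 50075),
    ]
    for host, port in unreachable_endpoints(source_endpoints):
        print(f"cannot reach {host}:{port}")
```

An empty result on every destination MapReduce node suggests the basic connectivity requirement is met. It does not verify permissions or protocol compatibility between the clusters.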