This is the documentation for CDH 4.6.0.
Documentation for other versions is available at Cloudera Documentation.

Apache Flume

— Hive sink support

Flume does not provide a native sink that stores data in a form that Hive can consume directly.

Bug: FLUME-1008

Severity: Medium

Workaround: None

— HBase sink does not work out of the box

The HBase sink does not work out of the box in Flume; it fails to connect to ZooKeeper.

Bug: HBASE-4072

Severity: Medium

Workaround: There are two workarounds:
  • Remove the file /etc/zookeeper/conf/zoo.cfg from the Flume client machines, and specify the ZooKeeper details in hbase-site.xml or flume.conf; OR
  • Use CDH 4.5 or later with Cloudera Manager 4.7 or later. Cloudera Manager passes a flag that causes Flume to exclude any file named zoo.cfg from its classpath at startup. The flag is enabled by default.

    Note: If you need to include zoo.cfg for backward compatibility, you can disable this flag as follows:

    1. Go to Flume service -> Configuration -> Agent -> Advanced.
    2. Disable the "HBase sink prefer hbase-site.xml over Zookeeper config" property.
    3. Restart the Flume service.

    Once the flag is disabled, files named zoo.cfg are again included in Flume's classpath.
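For the first workaround, the ZooKeeper details can be supplied on the sink itself in flume.conf. The fragment below is a minimal sketch, not a tested configuration: the agent name (agent1), sink name (hbaseSink), channel name (ch1), table, column family, and host names are all hypothetical, and it assumes the Flume version in use supports the HBase sink's zookeeperQuorum and znodeParent properties.

```properties
# Hypothetical agent "agent1" with an HBase sink "hbaseSink" (names are illustrative).
agent1.sinks.hbaseSink.type = hbase
agent1.sinks.hbaseSink.table = flume_events
agent1.sinks.hbaseSink.columnFamily = cf
agent1.sinks.hbaseSink.channel = ch1

# ZooKeeper details given directly to the sink instead of relying on zoo.cfg.
# Corresponds to hbase.zookeeper.quorum in hbase-site.xml (hosts are placeholders).
agent1.sinks.hbaseSink.zookeeperQuorum = zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181
# Corresponds to zookeeper.znode.parent in hbase-site.xml.
agent1.sinks.hbaseSink.znodeParent = /hbase
```

If the same values are already set in an hbase-site.xml on Flume's classpath, these two sink properties may be unnecessary.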

— Fast Replay does not work with encrypted File Channel

If an encrypted file channel is set to use fast replay, the replay will fail and the channel will fail to start.

Bug: FLUME-1885 (unresolved as of 2/21/14)

Severity: Low

Workaround: Disable fast replay for the encrypted channel by setting use-fast-replay to false.
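As an illustration of this workaround, the fragment below sketches an encrypted file channel in flume.conf with fast replay turned off. The agent name (agent1), channel name (ch1), directories, and key alias are hypothetical, and the key provider settings are abbreviated; only the use-fast-replay line is the workaround itself.

```properties
# Hypothetical encrypted file channel "ch1" on agent "agent1" (names and paths are illustrative).
agent1.channels.ch1.type = file
agent1.channels.ch1.checkpointDir = /var/lib/flume-ng/checkpoint
agent1.channels.ch1.dataDirs = /var/lib/flume-ng/data

# Encryption settings (key store details omitted; keep whatever your channel already uses).
agent1.channels.ch1.encryption.activeKey = key-0
agent1.channels.ch1.encryption.cipherProvider = AESCTRNOPADDING
agent1.channels.ch1.encryption.keyProvider = JCEKSFILE

# Workaround: make sure fast replay is disabled for the encrypted channel.
agent1.channels.ch1.use-fast-replay = false
```

Restart the agent after changing the channel configuration so that the setting takes effect.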