If you have an HBase application written against CDH4.1.2 HBase, you can upgrade your HBase servers (RegionServer, Master) to CDH4.4.0 without upgrading the clients.
If you have an HBase application written against CDH4.4.0 HBase, you can roll back your HBase servers (RegionServer, Master) to CDH4.1.2 without rolling back the clients.
HBase Client Upgrade Incompatibilities
Upgrading to CDH4.0 or CDH4.1
Upgrading from CDH3 or CDH4 Beta will require you to update the HBase client code and recompile it.
Upgrading to CDH4.2 and later
Programs using the HBase client libraries from before CDH4 Beta 1 must replace the HBase JAR file with the one from CDH4.2 in order to interoperate with CDH4.2. Additionally, the ZooKeeper data format has changed, so any client programs that interact directly with HBase's ZooKeeper information must be recompiled against the CDH4.2 client libraries in order to interoperate with CDH4.2 HBase.
- As of CDH4.2, the default split policy changed from ConstantSizeRegionSplitPolicy to IncreasingToUpperBoundRegionSplitPolicy (ITUBRSP).
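If your application depends on the pre-CDH4.2 splitting behavior, the old policy can be restored in hbase-site.xml. A minimal sketch, assuming the stock HBase 0.94 property name hbase.regionserver.region.split.policy:

```xml
<!-- Restore the pre-CDH4.2 split behavior. Assumes the standard
     HBase 0.94 property name hbase.regionserver.region.split.policy. -->
<property>
  <name>hbase.regionserver.region.split.policy</name>
  <value>org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy</value>
</property>
```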
The following upgrades introduce several new features to the HBase server, but these are turned off by default. None of these incompatibilities comes into play unless you actually turn on these features. However, if you do turn them on, you will not be able to roll back to a previous version of CDH4. The two main incompatible changes are listed below.
HBase Checksums
HBase 0.94 checksums are backward-incompatible with 0.92, the version delivered in CDH4.1.x. HBase 0.94, delivered in CDH4.2, introduces a new HFile format, V2.1. This format is incompatible with the format used in 0.92 in two ways: the data type of the version number is different, and checksums are stored in the internal data blocks. Neither incompatibility comes into play until checksums are turned on, so CDH4.2 HBase turns checksums off by default. If you turn checksums on, you will not be able to roll back to CDH4.1.x, because HBase 0.92 cannot read the HFiles.
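Turning checksums on is done through a RegionServer property in hbase-site.xml. A sketch, assuming the standard HBase 0.94 property name hbase.regionserver.checksum.verify:

```xml
<!-- Enable HBase-level checksums (off by default in CDH4.2).
     NOTE: once enabled, the resulting HFiles cannot be read by
     CDH4.1.x (HBase 0.92) RegionServers, so rollback is no longer
     possible. Assumes the stock 0.94 property name. -->
<property>
  <name>hbase.regionserver.checksum.verify</name>
  <value>true</value>
</property>
```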
HBase Bloom Filters
HBase CDH4.2 can produce bloom filters that are not backward compatible. HBase 0.94 introduces a new block type in the HFile: a bloom filter for deletes, written into an HFile whenever there are column-family deletes and bloom filters are turned on. HBase 0.92 does not have this block type, and any attempt by an HBase 0.92 (CDH4.1) RegionServer to read such a file results in the following error:
java.io.IOException: Invalid HFile block magic: DFBLMET2
    at org.apache.hadoop.hbase.io.hfile.BlockType.parse(BlockType.java:124)
    at org.apache.hadoop.hbase.io.hfile.BlockType.read(BlockType.java:135)
    at org.apache.hadoop.hbase.io.hfile.HFileBlock.<init>(HFileBlock.java:167)
    at org.apache.hadoop.hbase.io.hfile.HFileBlock.<init>(HFileBlock.java:76)
    at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderV2.readBlockData(HFileBlock.java:1395)
    at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader$1.nextBlock(HFileBlock.java:986)
    at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.<init>(HFileReaderV2.java:131)
    at org.apache.hadoop.hbase.io.hfile.HFile.pickReaderVersion(HFile.java:426)
    at org.apache.hadoop.hbase.io.hfile.HFile.createReader(HFile.java:435)
    at org.apache.hadoop.hbase.regionserver.StoreFile$Reader.<init>(StoreFile.java:1026)
    at org.apache.hadoop.hbase.regionserver.StoreFile.open(StoreFile.java:485)
    at org.apache.hadoop.hbase.regionserver.StoreFile.createReader(StoreFile.java:566)
    at org.apache.hadoop.hbase.regionserver.Store.loadStoreFiles(Store.java:293)
    at org.apache.hadoop.hbase.regionserver.Store.<init>(Store.java:230)
    at org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:2534)
    at org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:454)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3308)
    at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:3256)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:331)
    at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:107)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:169)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)
Other Incompatible Changes
- Cloudera Manager 3.x uses older HBase client libraries and so is not compatible with CDH4.2 HBase. You should upgrade to Cloudera Manager 4.
- When upgrading from CDH4.1.x to CDH4.2.x (and later), certain client methods are no longer code-compatible and will throw exceptions.
- The ZOOKEEPER_CONF environment variable is not automatically included in the HBase classpath. If your applications or scripts (such as CopyTable) depend upon automatically picking up settings from zoo.cfg, you must augment your hbase-site.xml file with your specific ZooKeeper settings.
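For example, the quorum and client-port settings that would previously have been picked up from zoo.cfg can be stated explicitly in hbase-site.xml. The host names below are placeholders; hbase.zookeeper.quorum and hbase.zookeeper.property.clientPort are the standard HBase property names:

```xml
<!-- ZooKeeper settings formerly read from zoo.cfg via ZOOKEEPER_CONF.
     Replace the hosts and port with your own ensemble. -->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
```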
- HBASE-5228 removes the 'transform' functionality from the REST server.
- HBASE-6553 removes the Avro Gateway.
- The Thrift JMX port is now 10103 by default. CDH4.2 incorporates HBASE-7277, so when the Thrift server is started it enables JMX on port 10103 unless HBASE_THRIFT_JMX_OPTS is set differently. If HBASE_THRIFT_JMX_OPTS is not set, you must configure a JMX access rule and password, or the Thrift server will not start. You can configure the JMX access rule and password by means of the environment variable HBASE_JMX_OPTS.
- The REST JMX port is now 10105 by default. CDH4.2 incorporates HBASE-7274, so when the REST server is started it enables JMX on port 10105 unless HBASE_REST_JMX_OPTS is set differently. If HBASE_REST_JMX_OPTS is not set, you must configure a JMX access rule and password, or the REST server will not start. You can configure the JMX access rule and password by means of the environment variable HBASE_JMX_OPTS.
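One way to supply the JMX authentication settings through HBASE_JMX_OPTS before starting the Thrift or REST server is sketched below. The flags are standard JVM remote-JMX options; the password and access file paths are placeholders for your own files:

```shell
# Standard JVM remote-JMX flags, passed to the HBase daemons through
# HBASE_JMX_OPTS. The file paths are placeholders for your own
# jmxremote.password / jmxremote.access files (readable only by the
# user running the daemon, e.g. chmod 600).
export HBASE_JMX_OPTS="-Dcom.sun.management.jmxremote.ssl=false \
  -Dcom.sun.management.jmxremote.authenticate=true \
  -Dcom.sun.management.jmxremote.password.file=/etc/hbase/conf/jmxremote.password \
  -Dcom.sun.management.jmxremote.access.file=/etc/hbase/conf/jmxremote.access"
```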
BulkLoad Co-processor (CDH4.3 and later)
As of CDH4.3, there is a new secure BulkLoad co-processor. In a secure cluster, you must add the following properties to hbase-site.xml; BulkLoad jobs will no longer work with the previous configuration.
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>
    org.apache.hadoop.hbase.security.token.TokenProvider,
    org.apache.hadoop.hbase.security.access.AccessController,
    org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint
  </value>
</property>
<property>
  <name>hbase.bulkload.staging.dir</name>
  <value>/tmp/hbase-staging</value>
</property>
Note: There should be no spaces or line breaks after the commas in the value field for hbase.coprocessor.region.classes (the snippet above is formatted purely for readability; do not copy and paste it). This change has the following ramifications for BulkLoad operations:
- A CDH4.3 client cannot bulkload to a CDH4.2 server.
- A CDH4.2 client can bulkload to a CDH4.3 server.
- A CDH4.3 client can bulkload to a CDH4.3 server only after the configuration shown above is in place.