HBase Performance Tuning
Column Family
Restrict tables to 2 or 3 column families.
Flushing and compaction are done on a per-region basis: if one column family carries the bulk of the data and triggers a flush, the adjacent families are flushed too, even though they hold little data.
Add a second or third column family only when access is column-scoped, i.e. you query one column family or the other, but usually not both at the same time.
Row-Key Design
Keep column family and column names as short as possible; they are stored with every cell.
Prefer shorter row keys, for the same reason.
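To see why short names matter, the sketch below estimates the bytes stored per cell. It assumes the commonly described KeyValue layout (key length, value length, row length, row, family length, family, qualifier, timestamp, key type, value) and ignores tags, compression and block encoding, so treat the numbers as a rough guide. The class name, method name and sample keys are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Rough per-cell size estimate under the assumed KeyValue layout.
// Illustration only: this is not an HBase API.
final class CellSizeEstimate {
    // Fixed fields: key length (4) + value length (4) + row length (2)
    // + family length (1) + timestamp (8) + key type (1).
    static final int FIXED_OVERHEAD = 4 + 4 + 2 + 1 + 8 + 1;

    // Row key, family and qualifier are repeated in every cell,
    // so their lengths are added to each cell's size.
    static int estimate(String row, String family, String qualifier, int valueLength) {
        return FIXED_OVERHEAD
                + utf8Length(row)
                + utf8Length(family)
                + utf8Length(qualifier)
                + valueLength;
    }

    private static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Same 8-byte value, stored under verbose and under compact names.
        int verbose = estimate("customer-0000012345", "customer_details", "email_address", 8);
        int compact = estimate("c12345", "d", "em", 8);
        System.out.println("verbose names: " + verbose + " bytes per cell");
        System.out.println("compact names: " + compact + " bytes per cell");
    }
}
```

With an 8-byte value, most of each cell is key material, so shortening the names shrinks every cell in the table.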
Constants
Instead of
Get get = new Get(rowkey);
Result r = htable.get(get);
byte[] b = r.getValue(Bytes.toBytes("cf"), Bytes.toBytes("attr")); // returns current version of value
use
// Convert the names to byte[] once and reuse them, instead of converting on every call.
public static final byte[] CF = Bytes.toBytes("cf");
public static final byte[] ATTR = Bytes.toBytes("attr");
...
Get get = new Get(rowkey);
Result r = htable.get(get);
byte[] b = r.getValue(CF, ATTR); // returns current version of value
Writing to HBase
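Use pre-split regions, so that initial writes are spread across RegionServers instead of all landing in the single region a new table starts with.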
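A minimal sketch of computing split points for a pre-split table. It assumes row keys start with a uniformly distributed two-character hex prefix (for example, from a hash); that key scheme and the helper names are assumptions for illustration, not something these notes prescribe. The resulting byte[][] has the shape accepted by the createTable overload that takes split keys.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: evenly spaced split points over the prefix space "00".."ff".
final class SplitKeys {
    // Returns numRegions - 1 boundaries dividing the two-character hex
    // prefix space into numRegions roughly equal ranges.
    static List<String> hexSplitPoints(int numRegions) {
        if (numRegions < 2 || numRegions > 256) {
            throw new IllegalArgumentException("numRegions must be between 2 and 256");
        }
        List<String> points = new ArrayList<>();
        for (int i = 1; i < numRegions; i++) {
            int boundary = i * 256 / numRegions;
            points.add(String.format("%02x", boundary));
        }
        return points;
    }

    // Converts the boundaries to the byte[][] shape used for split keys.
    static byte[][] toBytes(List<String> points) {
        byte[][] keys = new byte[points.size()][];
        for (int i = 0; i < points.size(); i++) {
            keys[i] = points.get(i).getBytes(StandardCharsets.UTF_8);
        }
        return keys;
    }

    public static void main(String[] args) {
        // Boundaries for a table pre-split into 4 regions.
        System.out.println(hexSplitPoints(4));
    }
}
```

Even spacing only helps if the keys really are uniformly distributed over the prefix space; for skewed keys, the split points would need to come from a sample of the data.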
AutoFlush - When performing a lot of Puts, make sure that setAutoFlush is set to false on your HTable instance. Otherwise, the Puts will be sent one at a time to the RegionServer. Puts added via htable.put(Put) and htable.put(List<Put>) wind up in the same write buffer. If autoFlush = false, these messages are not sent until the write buffer is filled. To explicitly flush the messages, call flushCommits. Calling close on the HTable instance will invoke flushCommits.
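The effect of autoFlush on the number of round trips can be illustrated with a toy model. This is not HBase client code: the class below is hypothetical, it counts the buffer in puts rather than bytes (the real write buffer is sized in bytes), and it treats every flush as a single round trip.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of client-side write buffering. Hypothetical, for illustration only.
final class WriteBufferModel {
    private final boolean autoFlush;
    private final int bufferCapacity; // assumption: capacity counted in puts, not bytes
    private final List<String> buffer = new ArrayList<>();
    private int rpcCount = 0;

    WriteBufferModel(boolean autoFlush, int bufferCapacity) {
        this.autoFlush = autoFlush;
        this.bufferCapacity = bufferCapacity;
    }

    // Buffers a put; sends immediately if autoFlush is on or the buffer is full.
    void put(String row) {
        buffer.add(row);
        if (autoFlush || buffer.size() >= bufferCapacity) {
            flushCommits();
        }
    }

    // Sends whatever is buffered as one modeled round trip.
    void flushCommits() {
        if (!buffer.isEmpty()) {
            rpcCount++;
            buffer.clear();
        }
    }

    // Closing flushes any remaining buffered puts.
    void close() {
        flushCommits();
    }

    int rpcCount() {
        return rpcCount;
    }

    public static void main(String[] args) {
        WriteBufferModel eager = new WriteBufferModel(true, 100);
        WriteBufferModel buffered = new WriteBufferModel(false, 100);
        for (int i = 0; i < 1050; i++) {
            eager.put("row" + i);
            buffered.put("row" + i);
        }
        eager.close();
        buffered.close();
        System.out.println("autoFlush=true round trips:  " + eager.rpcCount());
        System.out.println("autoFlush=false round trips: " + buffered.rpcCount());
    }
}
```

The model also shows why the final flushCommits or close matters: with autoFlush off, the last partial buffer is never sent without it.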