3
votes

We have some tables configured with DelimitedKeyPrefixRegionSplitPolicy (inherits from IncreasingToUpperBoundRegionSplitPolicy), a memstore flush size of 128M and a table MAX_FILESIZE of ~20GB.

Based on our calculations, we shouldn't get more than 5 regions per server until the region sizes reach 20GB (4^3 * 256M = 16GB), but we have 7-15 regions per region server.
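For reference, the arithmetic behind that estimate can be sketched as below. This is only a rough model, assuming the split threshold is min(MAX_FILESIZE, R^3 * 2 * flush size), where R is the number of regions of the table on one region server. The function name `split_threshold` and its parameters are made up for illustration and are not part of any HBase API.

```python
MB = 1024 ** 2
GB = 1024 ** 3


def split_threshold(regions_on_server, flush_size=128 * MB, max_filesize=20 * GB):
    """Approximate store size at which a split would be requested.

    Assumed formula (taken from the estimate in the question):
    min(max_filesize, regions^3 * 2 * flush_size).
    """
    initial_size = 2 * flush_size  # 256M with a 128M memstore flush size
    return min(max_filesize, initial_size * regions_on_server ** 3)


# Show the threshold for 1 to 6 regions of the table on a single server.
for regions in range(1, 7):
    print(f"{regions} region(s): split at {split_threshold(regions) / GB:.2f} GB")
```

Under these assumptions the threshold is 16GB with 4 regions on a server and is capped at the 20GB MAX_FILESIZE from the fifth region onward, which is why we expected the region count to settle around 5.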

We had previously merged the regions to get back to the expected count, but then they just split again. We are using HBase 0.98.4, and the table description shows {TABLE_ATTRIBUTES => {MAX_FILESIZE => '21474836480'... The default region max filesize is 1 GB and many of the regions are larger than that.

We can't figure out why they keep splitting despite our best efforts to keep the region count smaller. Any ideas?

1
We tried changing the max_filesize to 100GB as shown here hbase.apache.org/apidocs/org/apache/hadoop/hbase/util/… and we are still getting splits. - bridiver

1 Answer

0
votes

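You should change the split policy. ConstantSizeRegionSplitPolicy splits a region only once one of its stores grows past MAX_FILESIZE, regardless of how many regions the server holds: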

METADATA => {'SPLIT_POLICY' => 'org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy'}