I am implementing a Java program that reads Parquet files and loads the data into an HBase table. The table is pre-split into regions at the keys 'a', 'f', 'k', 'p', and 'u', and the rowkeys have the format a-xxxxxx, f-xxxxxx, ..., where xxxxxx is a random 6-character string. However, when I list the table regions, I find that all the data is stored in a single region, despite the variety of rowkey prefixes.

Here is the part of the code where I create the table and its regions:

    HTableDescriptor htable = new HTableDescriptor(tabname);
    htable.addFamily(new HColumnDescriptor(COL_FAMILY));
    if (hbaseAdmin.tableExists(tabname)) {
        hbaseAdmin.disableTable(tabname);
        hbaseAdmin.deleteTable(tabname);
    }
    byte[][] splits = new byte[][] {
            Bytes.toBytes('a'),
            Bytes.toBytes('f'),
            Bytes.toBytes('k'),
            Bytes.toBytes('p'),
            Bytes.toBytes('u')
    };
    hbaseAdmin.createTable(htable, splits);

But after inserting some data, when I list the table regions in the HBase shell, I get the following output:

[Image: HBase output of list_regions command]

Any help will be appreciated! Thank you all!

I resolved the problem: it was caused by the region split definition. The solution is to replace the chars with strings, as follows: byte[][] splits = new byte[][] { Bytes.toBytes("a"), Bytes.toBytes("f"), Bytes.toBytes("k"), Bytes.toBytes("p"), Bytes.toBytes("u") }; - rafik_bougacha

1 Answer

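I resolved the problem. It was caused by the region split definition: the split keys were built from char literals. As far as I can tell, Bytes has no toBytes(char) overload, so Bytes.toBytes('a') resolves to Bytes.toBytes(int) and produces the 4-byte key \x00\x00\x00a instead of the single byte a. Every rowkey such as a-xxxxxx sorts after all five of those keys, so all rows end up in the last region. The solution is to replace the chars with strings, as follows:

    byte[][] splits = new byte[][] {
            Bytes.toBytes("a"),
            Bytes.toBytes("f"),
            Bytes.toBytes("k"),
            Bytes.toBytes("p"),
            Bytes.toBytes("u")
    };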
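
To see why char literals give unusable split keys, here is a minimal sketch in plain Java with no HBase dependency. It rests on two assumptions: that Bytes.toBytes('a') resolves to the int overload (the char is widened, since Bytes does not appear to have a char overload), and that Bytes.toBytes(int) writes 4 big-endian bytes. The class SplitKeyDemo and its helpers intKey, stringKey, and regionIndex are hypothetical names that only imitate that behaviour and HBase's unsigned lexicographic key ordering.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SplitKeyDemo {

    // Assumption: stands in for Bytes.toBytes(int), i.e. 4 big-endian bytes.
    static byte[] intKey(int value) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(value).array();
    }

    // Assumption: stands in for Bytes.toBytes(String), i.e. the UTF-8 bytes.
    static byte[] stringKey(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    // Counts the split keys that sort at or before the rowkey. With keys
    // compared as unsigned bytes, this is the index of the region that
    // would hold the row (0 = the region before the first split key).
    static int regionIndex(byte[][] splits, byte[] rowKey) {
        int index = 0;
        for (byte[] split : splits) {
            if (Arrays.compareUnsigned(split, rowKey) <= 0) {
                index++;
            }
        }
        return index;
    }

    public static void main(String[] args) {
        char[] prefixes = {'a', 'f', 'k', 'p', 'u'};
        byte[][] charSplits = new byte[prefixes.length][];
        byte[][] stringSplits = new byte[prefixes.length][];
        for (int i = 0; i < prefixes.length; i++) {
            charSplits[i] = intKey(prefixes[i]); // char widened to int
            stringSplits[i] = stringKey(String.valueOf(prefixes[i]));
        }
        for (String row : new String[] {"a-abc123", "k-abc123", "u-abc123"}) {
            byte[] key = stringKey(row);
            System.out.println(row
                    + " -> char splits: region " + regionIndex(charSplits, key)
                    + ", string splits: region " + regionIndex(stringSplits, key));
        }
    }
}
```

In this sketch the char-derived keys all start with three zero bytes, so every sample rowkey sorts after all five of them and maps to region index 5 (the last region). With the string-derived keys the same rows map to region indexes 1, 3, and 5. It requires Java 9 or later for Arrays.compareUnsigned.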