During bulk data migration from an RDBMS to HBase, is there any chance that region splits happen too often? If they occur too frequently, they will surely affect both write and read performance.

I know pre-splitting may avoid these region splits to some extent.

But in our product design, we will first write only new data to HBase (maybe for 6 months). Once HBase is stable for reads and writes of new data, we will start migrating the existing data from the RDBMS to HBase. In this phase, I suspect region splits may occur too often because the data volume is very large, and that this will affect both read and write performance.

Our row key is incremental per user. For different users, it starts from different values.
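To make the pre-splitting idea concrete for this kind of key, here is a minimal sketch of how split points could be chosen from a sample of user IDs. It assumes, purely for illustration, that each row key begins with a fixed-width user ID such as `user0025`; that format and the `split_points` helper are hypothetical and not part of our actual schema. The resulting keys would then be supplied as the split keys when the table is created.

```python
from typing import List


def split_points(sample_keys: List[bytes], num_regions: int) -> List[bytes]:
    """Pick up to num_regions - 1 split keys at evenly spaced positions
    in a sorted, de-duplicated sample of row-key prefixes."""
    if num_regions < 2 or not sample_keys:
        return []
    keys = sorted(set(sample_keys))  # bytes sort lexicographically, like HBase row keys
    n = len(keys)
    points: List[bytes] = []
    for i in range(1, num_regions):
        candidate = keys[(i * n) // num_regions]
        # Skip repeats (small samples) and the smallest key, which would
        # leave the first region empty.
        if candidate != keys[0] and (not points or candidate != points[-1]):
            points.append(candidate)
    return points


# Hypothetical sample: 100 users with fixed-width IDs (an assumption).
users = [f"user{i:04d}".encode() for i in range(100)]
print(split_points(users, 4))
```

Because every user's keys only grow, each user's writes still land at the end of that user's range. Splitting on user boundaries spreads the users across regions but does not remove that per-user append pattern.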

Please suggest some solutions for maintaining server performance during the data migration.

1 Answer

I am a proponent of not pre-splitting HBase. One of the key features of the product is auto-sharding. Splitting itself is a pretty quick operation, but it puts you on the path to compaction. I have found on-heap compaction in HBase to behave poorly. At Splice Machine (open source), we moved compaction onto Spark, and we see very little impact on operations in HBase.