0
votes

When I run a major compaction in Apache HBase, it is not deleting rows marked for deletion unless I first perform a total reboot of HBase.

First I delete the row I want and subsequently perform a scan to see that the row I want is marked for deletion:

column=bank:respondent_name, timestamp=1407157745014, type=DeleteColumn                                             
column=bank:respondent_name, timestamp=1407157745014, value=STERLING NATL MTGE CO., INC

Then I run the command major_compact 'myTable' and wait a couple of minutes for the major compaction to finish in the background. Then when I perform the scan again, the row and tombstone marker are still there.

However, if I restart HBase and run another major compaction, the row and tombstone marker disappear. In a nutshell, major_compact only seems to be working properly if I perform a restart of HBase right before I run the major compaction. Any ideas on why this is the case? I would like to see the row and tombstone marker be deleted every time I run a major compaction. Thanks.

2 Answers

0
votes

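In my experience, you need to flush the table before running major_compact on it. A delete is first written to the MemStore, and a major compaction only rewrites the store files already on disk, so a tombstone that has not been flushed yet cannot be applied to them. A clean restart also flushes the MemStore, which likely explains why compaction works for you right after a restart.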

hbase> flush 'table'
hbase> major_compact 'table'
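
To see why the order matters, here is a toy Python model of a single store. It is only a sketch, not HBase code: the class ToyStore and its methods are made up for illustration, and it ignores timestamps and versions (a tombstone here covers every put on the same row).

```python
# Toy model of one HBase store: an in-memory MemStore plus store files.
# Hypothetical illustration only; names and behavior are simplified.

class ToyStore:
    def __init__(self):
        self.memstore = []  # recent edits, not yet written to a file
        self.files = []     # each flushed store file is a list of cells

    def put(self, row, value):
        self.memstore.append((row, "Put", value))

    def delete(self, row):
        # A delete only records a tombstone; nothing is removed yet.
        self.memstore.append((row, "Delete", None))

    def flush(self):
        # Write the MemStore contents out as a new store file.
        if self.memstore:
            self.files.append(self.memstore)
            self.memstore = []

    def major_compact(self):
        # Merge all store files into one, dropping tombstones and the
        # puts they cover. The MemStore is not consulted.
        cells = [cell for f in self.files for cell in f]
        deleted = {row for row, kind, _ in cells if kind == "Delete"}
        merged = [c for c in cells if c[1] == "Put" and c[0] not in deleted]
        self.files = [merged] if merged else []

    def raw_scan(self):
        # Everything still stored, tombstones included.
        return [cell for f in self.files for cell in f] + list(self.memstore)


store = ToyStore()
store.put("row1", "STERLING")
store.flush()
store.delete("row1")

# Compacting without a flush: the tombstone is still in the MemStore,
# so the put in the store file survives.
store.major_compact()
before = store.raw_scan()

# Flush first, then compact: tombstone and put are both dropped.
store.flush()
store.major_compact()
after = store.raw_scan()

print(before)
print(after)
```

In this model the first compaction leaves both the put and the tombstone visible in a raw scan, which matches what the question describes; only after the flush does the compaction see the tombstone and drop both cells.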

0
votes

Step 1. Create the table

create 'mytable', 'col1'

Step 2. Insert data into the table

put 'mytable',1,'col1:name','srihari'

Step 3. Flush the table

flush 'mytable'

Observe that there is now one file in the location below:

Location : /hbase/data/default/mytable/*/col1

Repeat steps 2 and 3 one more time and check the location again; there are now two files in it.

Now run the following command:

major_compact 'mytable'

Once the compaction finishes, there is only one file in that location, because the major compaction merged the two store files into one.