I wrote a simple test program to insert a row. The only difference from typical HBase Put example programs is that the Put instance and its KeyValue instances are created with an explicit timestamp.

The expected behavior is that a row is inserted. However, in my HBase environment, no row is inserted.

Below is my test program.

import java.io.*;
import java.util.*;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.*;

public class Test
{
    // Names of table, family, qualifier and row ID.
    private static final byte[] TABLE     = Bytes.toBytes("test-table");
    private static final byte[] FAMILY    = Bytes.toBytes("test-family");
    private static final byte[] QUALIFIER = Bytes.toBytes("test-qualifier");
    private static final byte[] ROWID     = Bytes.toBytes("test-rowid");

    /**
     * The entry point of this program.
     *
     * <p>
     * This program assumes that there already exists an HBase
     * table named "test-table" with a column family named
     * "test-family". To create an HBase table satisfying these
     * conditions, type the following at the hbase shell prompt.
     * </p>
     *
     * <pre>
     * hbase&gt; create 'test-table', 'test-family'
     * </pre>
     *
     * <p>
     * This program inserts a row whose row ID is "test-rowid"
     * with a column named "test-family:test-qualifier". The
     * value of the column is the string expression of
     * <code>new Date()</code>.
     * </p>
     */
    public static void main(String[] args) throws Exception
    {
        // Get the table.
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, TABLE);

        // Prepare data to put.
        byte[] value = Bytes.toBytes(new Date().toString());
        Put put = new Put(ROWID);
        put.add(FAMILY, QUALIFIER, value);

        // Clone Put with a timestamp.
        put = clone(put, 10);

        // Put the data.
        table.put(put);

        // Read back the data.
        Get get = new Get(ROWID);
        Result result = table.get(get);

        // Dump the read data.
        System.out.println("DATA = " + result.toString());
    }

    /**
     * Clone the given Put instance with the given timestamp.
     */
    private static Put clone(Put a, long timestamp) throws IOException
    {
        // Create a Put instance with the specified timestamp.
        Put b = new Put(a.getRow(), timestamp);

        Map<byte[], List<KeyValue>> kvs = a.getFamilyMap();

        // Copy KeyValue's from the source Put (a) to
        // the cloned Put (b). Note the given timestamp
        // is used for each new KeyValue instance.
        for (List<KeyValue> kvl : kvs.values())
        {
            for (KeyValue kv : kvl)
            {
                b.add(new KeyValue(
                    kv.getRow(),
                    kv.getFamily(),
                    kv.getQualifier(),
                    timestamp,
                    kv.getValue()));
            }
        }

        return b;
    }
}

The console output generated by this program is as follows.

DATA = keyvalues=NONE

And "scan" at the hbase shell says "0 row(s)".

hbase(main):011:0> scan 'test-table'
ROW                                              COLUMN+CELL
0 row(s) in 0.0080 seconds

Commenting out the code line to clone a Put instance like below,

        // Clone Put with a timestamp.
        //put = clone(put, 10);

that is, using a Put instance created with no timestamp argument changes the behavior of the program. In this case, the console output shows the inserted value,

DATA = keyvalues={test-rowid/test-family:test-qualifier/1344594210281/Put/vlen=28}

and "scan" shows the inserted row.

hbase(main):012:0> scan 'test-table'
ROW                                              COLUMN+CELL
 test-rowid                                      column=test-family:test-qualifier, timestamp=1344594210281, value=Fri Aug 10 19:23:30 JST 2012
1 row(s) in 0.0110 seconds

The logic used in my test program to clone a Put instance with a timestamp is an excerpt from an open source project which is known to work. So I guess that the root cause of this problem lies in my HBase environment, but I have no clue what it is. My investigation may be insufficient, but I have not seen any errors in the HBase logs yet.

Could anyone shed some light on this problem, please?

3
Have you tried creating a new Put with a timestamp instead of cloning the existing one? You can also specify the timestamp in the add method of the Put class. Just to see if the problem persists; if so, it could mean that the problem is in your environment, as you suspect. - Diego

3 Answers

0
votes

The timestamp, column family and column name together constitute a combined key. The timestamp here is a UNIX timestamp (milliseconds since the epoch).

0
votes

KeyValueTestUtil.create can create a KeyValue object, which you can then add to the Put.

0
votes

I'm not sure this will help, but - I've been there before, so, just trying to help you debug your logic.

The first thing I would make sure of is that you have never deleted that row before. The way HBase Deletes work is that a tombstone marker is placed at the location (Row/KeyValue) in question with the current timestamp (unless you specified another one). So, if you slap a Put after a Delete with a timestamp that is not newer than the tombstone's, and a major compaction has not taken place, you'll never see your Put... Here is a thread on that: https://issues.apache.org/jira/browse/HBASE-5241 - you can try to execute a "major_compact" from the HBase shell on that table before performing another test cycle.

That's my first guess. It is in line with a testing scenario of: Put at the current time, perform a scan, assert the Put operation works (yes it does - yeah!), then Delete the current data to reset the palette, perform the next Put with a smaller timestamp, perform a scan, scratch head...
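To make the masking effect concrete, here is a minimal sketch in plain Java. It is a toy in-memory model of a single cell, not the HBase client API: the class `TombstoneSketch` and its methods are hypothetical names made up for illustration, and it only models the kind of delete marker that hides every version up to its own timestamp.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of one HBase cell with delete markers. NOT the HBase API. */
class TombstoneSketch {
    /** One write to the cell: either a put or a delete marker. */
    private static final class Entry {
        final long timestamp;
        final boolean isDelete;
        final String value;

        Entry(long timestamp, boolean isDelete, String value) {
            this.timestamp = timestamp;
            this.isDelete = isDelete;
            this.value = value;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void put(long timestamp, String value) {
        entries.add(new Entry(timestamp, false, value));
    }

    /** Adds a tombstone that hides every put with timestamp <= the given one. */
    void delete(long timestamp) {
        entries.add(new Entry(timestamp, true, null));
    }

    /** Simulated major compaction: drops tombstones and the puts they hide. */
    void majorCompact() {
        final long tombstone = newestTombstone();
        entries.removeIf(e -> e.isDelete || e.timestamp <= tombstone);
    }

    /** Returns the newest visible value, or null if nothing is visible. */
    String get() {
        long tombstone = newestTombstone();
        Entry best = null;
        for (Entry e : entries) {
            if (e.isDelete || e.timestamp <= tombstone) {
                continue; // tombstones and masked puts are never returned
            }
            if (best == null || e.timestamp > best.timestamp) {
                best = e;
            }
        }
        return best == null ? null : best.value;
    }

    /** Newest tombstone timestamp, or -1 if none (timestamps assumed >= 0). */
    private long newestTombstone() {
        long newest = -1L;
        for (Entry e : entries) {
            if (e.isDelete && e.timestamp > newest) {
                newest = e.timestamp;
            }
        }
        return newest;
    }

    public static void main(String[] args) {
        TombstoneSketch cell = new TombstoneSketch();
        cell.put(1344594210281L, "first value"); // put at "current time"
        cell.delete(1344594300000L);             // delete a bit later
        cell.put(10L, "second value");           // put with an old timestamp
        System.out.println("before compaction: " + cell.get());
        cell.majorCompact();
        cell.put(10L, "third value");
        System.out.println("after compaction:  " + cell.get());
    }
}
```

In this toy model the Put with timestamp 10 stays invisible while the tombstone exists, and a Put with the same timestamp becomes visible once the simulated major compaction has removed the tombstone. Real HBase has several delete marker types and table settings that this sketch ignores.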

Parting thought - by default, the Get operation returns the latest version of a KeyValue. So, if in your test you execute a Put with a timestamp of T1 and then later execute a Put with a timestamp of T2, and T2 < T1, then a Get will return the value associated with T1. This may be counter-intuitive at first, but - it's all good :)
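The T1/T2 case can be sketched the same way. Again, this is a hypothetical toy model in plain Java (the class `LatestVersionSketch` is invented for illustration and is not part of HBase); it only shows that "latest" is decided by the timestamp, not by the order in which the Puts were issued.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the versions of one cell. NOT the HBase API. */
class LatestVersionSketch {
    // Versions keyed by timestamp, sorted so the newest timestamp comes first.
    private final TreeMap<Long, String> versions =
        new TreeMap<>(Collections.<Long>reverseOrder());

    void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    /** Returns the value with the largest timestamp, or null if empty. */
    String get() {
        Map.Entry<Long, String> newest = versions.firstEntry();
        return newest == null ? null : newest.getValue();
    }

    public static void main(String[] args) {
        LatestVersionSketch cell = new LatestVersionSketch();
        long t1 = 1344594210281L; // T1: a "current time" timestamp
        long t2 = 10L;            // T2 < T1: an explicit, much older timestamp
        cell.put(t1, "value at T1");
        cell.put(t2, "value at T2"); // issued later, but older by timestamp
        System.out.println(cell.get());
    }
}
```

Here the second Put is stored, but the read still returns the value written at T1 because T1 is the larger timestamp.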

Hope something in there helps you on your journey.