1
votes

I have an HBase table that is loaded via the HBase Java API like so:

put.add(Bytes.toBytes(HBaseConnection.FAMILY_NAME), Bytes.toBytes("value"), Bytes.toBytes(value));

(where the variable value is a plain Java float.)

I proceed to load this with Pig as follows:

raw = LOAD 'hbase://tableName' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('family:value', '-loadKey true -limit 5') AS (id:chararray, value:float);

However, when I dump this with:

dump raw;

I get:

[main] WARN  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Encountered Warning FIELD_DISCARDED_TYPE_CONVERSION_FAILED 5 time(s).

for each float value. The IDs are printed fine.

I'm running:

  • Apache Hadoop 0.20.2.05
  • Pig 0.9.2
  • HBase 0.92.0

My question: Why can't Pig handle these float values? What am I doing wrong?

2
Try it without the AS (id:chararray, value:float) part. What does it dump? Try converting value to a String before calling Bytes.toBytes on it, just to find out what the problem is. - Hari Menon
I removed the "AS" clause as you suggested, but all I get is odd-looking UTF-8 characters (since the data is binary). - Max Charas

2 Answers

4
votes

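It turns out you have to specify a caster. By default, HBaseStorage uses Utf8StorageConverter, which expects the cell to hold the value as text, whereas Bytes.toBytes(float) writes the raw 4-byte binary encoding. Like so: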

USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('family:value', '-loadKey true -limit 5 -caster HBaseBinaryConverter')
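
To see why the text-based default fails, here is a minimal sketch in plain Java with no HBase or Pig dependency. The class and method names are made up for illustration. binaryEncode is assumed to produce the same layout as Bytes.toBytes(float) (4 bytes, big-endian IEEE 754), and parseAsText only roughly imitates what a text-based caster does with a cell; it is not Pig's actual code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; none of these names come from HBase or Pig.
public class FloatEncodingDemo {

    // Assumed equivalent of Bytes.toBytes(float): 4 bytes, big-endian IEEE 754.
    public static byte[] binaryEncode(float value) {
        return ByteBuffer.allocate(4).putFloat(value).array();
    }

    // What a text-based caster expects: the decimal representation as UTF-8.
    public static byte[] textEncode(float value) {
        return Float.toString(value).getBytes(StandardCharsets.UTF_8);
    }

    // Rough imitation of a text-based caster: parse the bytes as a decimal string.
    // Returns null when parsing fails, which corresponds to a discarded field.
    public static Float parseAsText(byte[] cell) {
        try {
            return Float.valueOf(new String(cell, StandardCharsets.UTF_8));
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // Rough imitation of a binary caster: read the 4 bytes back as a float.
    public static float parseAsBinary(byte[] cell) {
        return ByteBuffer.wrap(cell).getFloat();
    }

    public static void main(String[] args) {
        float value = 1.5f;
        byte[] binary = binaryEncode(value);
        byte[] text = textEncode(value);

        System.out.println(parseAsText(binary));   // null
        System.out.println(parseAsText(text));     // 1.5
        System.out.println(parseAsBinary(binary)); // 1.5
    }
}
```

So there are two ways to make writer and reader agree: keep the binary write and load with the binary caster as above, or store the float as a string (as suggested in the comments) and keep the default caster.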
0
votes

Please try it the following way:

test = load '/user/training/user' using PigStorage(',') 
  as (user_id, name, age:int, country, gender);

The delimiter is given explicitly because the default delimiter for loading is tab.