I have an external Hive table with a pretty simple structure. The table is created in Hive with:
(user string, names array<string>)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' COLLECTION ITEMS TERMINATED BY '\001'
STORED AS TEXTFILE
(I've tried other delimiters, too).
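To make the file layout concrete, here is a minimal Python sketch of how I understand one row to be laid out on disk with these delimiters: the two fields separated by a tab, and the array items separated by the '\001' (Ctrl-A) character. The sample values and the parse_row helper are made up for illustration; they are not from my actual data.

```python
# Delimiters taken from the table definition above.
FIELD_DELIM = "\t"    # FIELDS TERMINATED BY '\t'
ITEM_DELIM = "\x01"   # COLLECTION ITEMS TERMINATED BY '\001' (Ctrl-A)

def parse_row(line):
    """Split one text line into (user, names) using the delimiters above."""
    # Strip only the trailing newline, then split off the first field.
    user, raw_names = line.rstrip("\n").split(FIELD_DELIM, 1)
    # An empty second field is treated here as an empty array (an assumption).
    names = raw_names.split(ITEM_DELIM) if raw_names else []
    return user, names

# Hypothetical sample row, not real data.
sample = "alice\tAl\x01Ally\x01Alice B.\n"
print(parse_row(sample))
```

That is the shape I'd like to end up with in Pig: one user plus a variable-length list of names.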
In Pig, I can't seem to figure out the right way to use a bag or tuple to just load a simple array! Here's what I've tried without luck:
users = load '<file>' using PigStorage() AS (user:chararray, names:bag{tuple(name:chararray)});
users = load '<file>' using PigStorage() AS (user:chararray, names:chararray);
I've tried some other things too, but the best I've managed is to get the names loaded as a single string with the delimiter removed, which doesn't help. How do I load a variable-length array of strings?
Thanks