I am having a problem using the fields (schema) that I get after flattening a string in Pig. I have the following code:
Data = load 'data/*.txt' using PigStorage( ) AS(...., date:chararray, .....);
B = foreach Data generate FLATTEN(REGEX_EXTRACT_ALL(date, '"(.*)/(.*)/(.*) (.*):(.*):(.*)"')) AS (month:int, day:int, year:int, hour:int, min:int, second:int);
--B = filter B by year==2015;
--B = filter B by month ==1 OR month ==2;
C = foreach B generate speed, month, day, year, hour, min;
store C into 'data/out_files' using PigStorage(',');
Here date has the form '2/23/2015 23:56:49'.
This works fine as written. But when I enable either of the commented-out filters on B (year == 2015, or month == 1 OR month == 2), the script no longer works. Does anyone know how to use a field in a filter after flattening the string? Thank you for your help.