I have a CSV file with hundreds of columns. When I load the file into Pig, I don't want to name every column like this:

A = load 'path/to/file' as (a,b,c,d,e......)

Since I'll filter a lot of them at the second step:

B = foreach A generate $0,$2,....;

But at this point, can I assign a name and type to each column of B? Something like:

B = foreach A generate $0,$2,... AS (a:int,b:int,c:float)

I tried the above code but it doesn't work.

Thanks.

1 Answer

You have to give each field its own alias, right after the expression it names, rather than one list at the end:

B = foreach A generate $0 as a, $2 as b,...

Note that each field simply keeps the type it already has; the alias only assigns a name.
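
If a specific type is also wanted, one option is to cast each expression explicitly before naming it. The following is only a sketch: it assumes the file is comma-delimited and loaded with PigStorage(','), and the column positions ($0, $2, $4) and the names a, b, c are placeholders, not taken from the real file. Loading without a schema leaves every field as a bytearray, which is why the casts are needed.

```pig
-- Load without declaring a schema; every field is then a bytearray.
-- PigStorage(',') is an assumption for a comma-separated file.
A = LOAD 'path/to/file' USING PigStorage(',');

-- Keep only the wanted columns, cast each one, then give it a name.
-- Positions and names below are illustrative placeholders.
B = FOREACH A GENERATE (int)$0 AS a, (int)$2 AS b, (float)$4 AS c;

-- Print the resulting schema to confirm the names and types.
DESCRIBE B;
```

Values that cannot be converted by a cast typically come through as null, so it may be worth checking B for unexpected nulls.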