1
votes

I want to write result to single table by selecting columns from multiple tables. I want to achieve something like below in hive.

SELECT Table0.num,Table1.field1, Table2.field2, Table3.field3, Table4.field4
FROM Table0
FULL OUTER JOIN Table1 ON Table0.num = Table1.num
FULL OUTER JOIN Table2 ON Table0.num = Table2.num
FULL OUTER JOIN Table3 ON Table0.num = Table3.num
FULL OUTER JOIN Table4 ON Table0.num = Table4.num

Please suggest how should I achieve this in hive?

When I am trying to execute above query in hive I am getting following exception.

FAILED: ParseException line 2:11 cannot recognize input near '' 'on' 'Table0' in select expression

2

2 Answers

1
votes

If I create a bunch of tables like this:

create table Table0(num int, Field0 int);
create table Table1(num int, Field1 int);
create table Table2(num int, Field2 int);
create table Table3(num int, Field3 int);
create table Table4(num int, Field4 int);

Then run the query you post it works just fine.

OK
Time taken: 38.26 seconds

What version of hive are you running? Is it possible the query was modified before posting on stackoverflow?

0
votes

You don't need to use GROUP BY if you are just selecting columns from multiple tables without aggregating. You could also use SELECT DISTINCT (instead of the GROUP BY) if you don't want to have duplicate rows.