1
votes

I have two tables with identical columns, I want to find out records which are present in first table but not present in second table. The key between two tables is composite of three columns. I am writing a Hive query like below:

*

Select a.x,b.y from table_1 a left outer join table_2 b on
    a.c1=b.c1 and a.c2=b.c2 and a.c3=b.c3
    where isnull(b.c1) or isnull(b.c2) or isnull(b.c3);

*

Is this query correct for the purpose ? If there are 100 records in table_1 and out of which 50 match with table_2 will the result have remaining 50 rows from table_1 or more, as I am taking join on more than one attribute and using 'OR' in where condition.

1

1 Answers

1
votes

Assuming the columns are not null, then you would normally do a single comparison:

Select a.x, b.y
from table_1 a left outer join
     table_2 b
     on a.c1 = b.c1 and a.c2 = b.c2 and a.c3 = b.c3
where b.c1 is null;

However, returning b.y is rather useless. You know the value is NULL.