2
votes

I have two tables Tables1 with ID, name, and Table2 has ID1, ID2, and ID3, name1, name2, and name3. I want to select table1.ID not exists in tables2: ID1, ID2, and ID3

select T1.ID,t1.name
from table1 t1
where  not exists (
   SELECT *
   FROM table2 t2
where t1.ID=t2.ID1  or t1.ID=t2.ID2 or or t1.ID=t2.ID3  )

I get error message for this query

3

3 Answers

0
votes

Join first by ID1, then resulted dataset join by ID2, then resulted dataset join by ID3:

select  p2.ID, p2.name       --pass3
 from
  (select p1.ID, p1.name     --pass2
     from
      (SELECT T1.ID, T1.name --pass1
         FROM Table1 T1
         LEFT JOIN Table2 T2 ON T2.ID1 = T1.ID    
        where T2.ID1 is null                        --not in ID1
      ) p1 LEFT JOIN Table2 T2 ON T2.ID2 = p1.ID 
    where T2.ID1 is null                            --also not in ID2
  ) p2 LEFT JOIN Table2 T2 ON T2.ID3 = p2.ID
where T2.ID1 is null                                --also not in ID3

Joins on 2 and 3 steps will receive already reduced dataset from T1 and this solution may be good for big tables.

0
votes

After a little research, I could find this. Basically, you can't have multiple columns for a subquery in a IN or a NOT IN condition in the WHERE clause. This is why your query is currently failing : Your subquery get all the columns from Table2.


From my understanding of your question, you want a select where the results would be the elements that are not existing in Table2.

To do this, you can simply use a LEFT OUTER JOIN. In SQL, I would left join on all three columns, but it seems like Hive does not support multiple conditions in JOIN statements, so you can use the following alternative :

SELECT T1.ID, T1.name
    FROM Table1 T1
    LEFT JOIN Table2 T2_1 ON T2_1.ID1 = T1.ID
    LEFT JOIN Table2 T2_2 ON T2_2.ID2 = T1.ID
    WHERE (T2_1.id IS NULL) AND                  -- the id of Table2 - T2_1    
          (T2_2.id IS NULL)                      -- the id of Table2 - T2_2

Just add as many LEFT JOIN and conditions in the WHERE clause as you have columns to check.

Here's a fiddle with this query concept (the data are not the same, but the concept is).

-1
votes
SELECT DISTINCT ID,NAME
FROM
(SELECT T1.ID, T1.name
    FROM Table1 T1
    LEFT OUTER JOIN Table2 T2 ON T2.ID1 = T1.ID
where T2.ID1 is null
union
SELECT T1.ID, T1.name
    FROM Table1 T1
    LEFT OUTER JOIN Table2 T2 ON T2.ID2 = T1.ID
where T2.ID2 is null
union
SELECT T1.ID, T1.name
    FROM Table1 T1
    LEFT OUTER JOIN Table2 T2 ON T2.ID3 = T1.ID
where T2.ID3 is null)JO