14
votes

I have two tables which I want to join together using a left outer join. However, even though my left table contains only unique values, the right table satisfies the CONDITION more than once and as such, adds extra rows to the resultset.

Code to replicate problem:

declare @tb1 table (c1 int)
declare @tb2 table (c2 int)

INSERT INTO @tb1 VALUES (1)
INSERT INTO @tb1 VALUES (2)
INSERT INTO @tb1 VALUES (3)
INSERT INTO @tb1 VALUES (4)

INSERT INTO @tb2 VALUES (3)
INSERT INTO @tb2 VALUES (4)
INSERT INTO @tb2 VALUES (5)
INSERT INTO @tb2 VALUES (6)

select * from @tb1 left outer join @tb2 ON c1 = c2

INSERT INTO @tb2 VALUES (3)
INSERT INTO @tb2 VALUES (4)
INSERT INTO @tb2 VALUES (5)
INSERT INTO @tb2 VALUES (6)

select * from @tb1 left outer join @tb2 ON c1 = c2

As you can see the first SELECT returns 4 rows, where the second SELECT 6, although the left table remains unchanged.

How does one stay strict to the left table, and only use the right table to COMPLEMENT the rows from the left table?

HELP!

RESULTS:
c1          c2
----------- -----------
1           NULL
2           NULL
3           3
4           4

[DUPLICATE @tb2 records]

c1          c2
----------- -----------
1           NULL
2           NULL
3           3
3           3
4           4
4           4
5
I have a table with multiple non distinct rows (login logs) my left table is this table grouped by user ID, so it is distinct. the right table is the original table, which contains the data i need. I therefore link the right table to the left table using the USERid, so that I can return useful dataTheofanis Pantelides

5 Answers

17
votes

Sorry, but your thinking is skewed.

Think about it this way: if you only want one single row from tb2 for each row in tb1, which one should the server choose? The fact is that from the definition of a join, every row in the right-hand-side table that matches the left-hand-side row is a match and must be included.

You'll have to ensure tbl2 has distinct values for c2 before the join. Murph's suggestion might do it, provided your SQL variant supports DISTINCT [column] (not all do).

6
votes

Try useing

select DISTINCT * from @tb1 left outer join @tb2 ON c1 = c2
5
votes

If you want to keep just single rows on the left hand side, you'll need to decide what you want to show on the right, for each unique value on the left. If you want to show a count, for example, you could do this:

select b1.c1, x.c from @tb1 b1 
left outer join 
(
  select c2, count(*) as c 
  from @tb2
  group by c2
) as x 
ON b1.c1 = x.c2

or if you just want one occurence of values from c2:

select b1.c1, x.c2 from @tb1 b1 
left outer join 
(
  select c2
  from @tb2
  group by c2
) as x 
ON b1.c1 = x.c2
4
votes

select distinct * from @tb1 left outer join @tb2 ON c1 = c2

4
votes

Hmm, the query is doing what its supposed to since there are duplicate records (or at least duplicate identifiers) in the right hand table.

To get the effect you want something like:

SELECT * FROM @tb1 LEFT OUTER JOIN (SELECT DISTINCT c2 FROM @tb2) t2 ON @tb1.c1 = t2.c2

If that isn't sufficient you'll need to explain the requirement in a bit more detail.