0
votes

I would like a query that would show the most complete matching records as below. I am using MS Access though an ANSI SQL or Oracle answer would be ok if no access solution.

  • col2 is the only non-nullable column, but as you can see by the result, there may be several records with same value in col2 returned
  • As you can see, nulls are ignored in coalescing records together to make result.

Table1

col1    col2    col3    col4
---------------------------------
c       d               g       
c       d        
        d               g
d       e       g       
d       e        
        e       g
e       e       g       
e       e        
        e       g
e       f
        g       h

Result of query

col1    col2    col3    col4
---------------------------------
c       d               g
d       e       g       
e       e       g       
e       f
        g       h               

I thank you for trying but after further investigation, I found a bug with both proposed solutions using "join" and "not exists".

bug in proposed "join" solution There is a bug as shown below where two rows became 4 in the proposed


col1 col2 col3 col4

    d       g       i                
    d       h       j

result of query

col1 col2 col3 col4

    d       g       i                
    d       g       j
    d       h       i                
    d       h       j

bug in proposed "not exists" solution

After fixing the query by adding a closing parentheses I was able to run the "not exists" solution. Two rows that match on col2 but don't match on col3 are combined into one row. They have different data and should be unique

col1 col2 col3 col4 --------------------------------- d g i d h result of query col1 col2 col3 col4 --------------------------------- d g i
1
In your examples, the returned row is always one of the original rows. Is this always the case?Gordon Linoff
By "most complete" could you also say "row with the most columns containing data values"?Jacob H
Gordon- it doesn't have to be one of the original rows. Jacob H- You could say "row with the most columns containing data values"M In

1 Answers

0
votes

For your sample data, this should work:

select t1.col1, t2.col2, t3.col3, t4.col4
from (((select distinct col2 from t) as t2 left join
       (select distinct col2, col1 from t where col1 is not null) as t1
       on t1.col2 = t2.col2
      ) left join
      (select distinct col2, col3 from t where col3 is not null) as t3
      on t3.col2 = t2.col2
     ) left join
     (select distinct col2, col4 from t where col4 is not null) as t4
     on t4.col2 = t2.col2;

If the rows in the result set are always in the original data, then not exists works:

select t.*
from t
where not exists (select 1
                  from t as t2
                  where t2.col2 = t.col2 and
                        (t2.col1 is not null and t.col1 is null or
                         t2.col3 is not null and t.col3 is null or
                         t2.col4 is not null and t.col4 is null
                 );