0
votes

I have to do an self join on a table. I am trying to return a list of several columns to see how many of each type of drug test was performed on same day (MM/DD/YYYY) in which there were at least two tests done and at least one of which resulted in a result code of 'UN'.

I am joining other tables to get the information as below. The problem is I do not quite understand how to exclude someone who has a single result row in which they did have a 'UN' result on a day but did not have any other tests that day.

Query Results (Columns)

County, DrugTestID, ID, Name, CollectionDate, DrugTestType, Results, Count(DrugTestType)

I have several rows for ID 12345 which are correct. But ID 12346 is a single row of which is showing they had a row result of count (1). They had a result of 'UN' on this day but they did not have any other tests that day. I want to exclude this.

I tried the following query

select  
    c.desc as 'County',
    dt.pid as 'PID',
    dt.id as 'DrugTestID',
    p.id as 'ID',
    bio.FullName as 'Participant',
    CONVERT(varchar, dt.CollectionDate, 101) as 'CollectionDate',       
    dtt.desc as 'Drug Test Type',
    dt.result as Result,
    COUNT(dt.dru_drug_test_type) as 'Count Of Test Type'
from    
    dbo.Test as dt with (nolock)
       join dbo.History as h on dt.pid = h.id 
       join dbo.Participant as p on h.pid = p.id
       join BioData as bio on bio.id = p.id
       join County as c with (nolock) on p.CountyCode = c.code
       join DrugTestType as dtt with (nolock) on dt.DrugTestType = dtt.code
       inner join 
          (
             select distinct 
                dt2.pid,
                CONVERT(varchar, dt2.CollectionDate, 101) as 'CollectionDate'
             from   
           dbo.DrugTest as dt2 with (nolock)
                   join dbo.History as h2 on dt2.pid = h2.id 
                   join dbo.Participant as p2 on h2.pid = p2.id
             where
                dt2.result = 'UN'
                and dt2.CollectionDate between '11-01-2011' and '10-31-2012'
                and p2.DrugCourtType = 'AD'
          ) as derived
            on dt.pid = derived.pid 
            and convert(varchar, dt.CollectionDate, 101) = convert(varchar, derived.CollectionDate, 101)
group by 
   c.desc, dt.pid, p.id, dt.id, bio.fullname, dt.CollectionDate, dtt.desc, dt.result
order by 
    c.desc ASC, Participant ASC, dt.CollectionDate ASC
2

2 Answers

0
votes

This is a little complicated because the your query has a separate row for each test. You need to use window/analytic functions to get the information you want. These allow you to do calculate aggregation functions, but to put the values on each line.

The following query starts with your query. It then calculates the number of UN results on each date for each participant and the total number of tests. It applies the appropriate filter to get what you want:

with base as (<your query here>)
select b.*
from (select b.*,
             sum(isUN) over (partition by Participant, CollectionDate) as NumUNs,
             count(*) over (partition by Partitipant, CollectionDate) as NumTests
      from (select b.*,
                  (case when result = 'UN' then 1 else 0 end) as IsUN
            from base
           ) b
     ) b
where NumUNs <> 1 or NumTests <> 1

Without the with clause or window functions, you can create a particularly ugly query to do the same thing:

select b.*
from (<your query>) b join
     (select Participant, CollectionDate, count(*) as NumTests,
             sum(case when result = 'UN' then 1 else 0 end) as NumUNs
      from (<your query>) b
      group by Participant, CollectionDate
     ) bsum
     on b.Participant = bsum.Participant and
        b.CollectionDate = bsum.CollectionDate
where NumUNs <> 1 or NumTests <> 1
0
votes

If I understand the problem, the basic pattern for this sort of query is simply to include negating or exclusionary conditions in your join. I.E., self-join where columnA matches, but columns B and C do not:

select
  [columns]

from
  table t1
  join table t2 on (
    t1.NonPkId = t2.NonPkId
    and t1.PkId != t2.PkId
    and t1.category != t2.category
  )

Put the conditions in the WHERE clause if it benchmarks better:

select
  [columns]

from
  table t1
  join table t2 on (
    t1.NonPkId = t2.NonPkId
  )

where
    t1.PkId != t2.PkId
    and t1.category != t2.category

And it's often easiest to start with the self-join, treating it as a "base table" on which to join all related information:

select
  [columns]

from
  (select
    [columns]

  from
    table t1
    join table t2 on (
      t1.NonPkId = t2.NonPkId
    )

  where
      t1.PkId != t2.PkId
      and t1.category != t2.category
  ) bt

  join [othertable] on (<whatever>)
  join [othertable] on (<whatever>)
  join [othertable] on (<whatever>)

This can allow you to focus on getting that self-join right, without interference from other tables.