0
votes

I am trying to write a sql in bigquery and I have a requirement to filter records based on a group by column and another column in the table what I mean is I want to check if the group by column(column name:mnt) has more than one row then I have to check if col2 (col name: zel) value, then I have to apply a filter saying col2 ='X' and only pass that record else pass i.e dont filter the records if the col1 has only distinct one value per group

So I have written a sql to do this I have used row_number as well as rank , dense rank function but I noticed the value of rank and dense rank and row number functions return same value for a group

Please see the below code

#standardsql
with t1 as  (SELECT  mnt, 
case when rank() over (partition by ltrim(rtrim(mnt)) order by 
ltrim(rtrim(mnt)) asc) >1 then 'Y' else 'N' end 
as flag,
rank() over (partition by mnt order by mnt) as rn, 
dense_rank() over (partition by mnt order by mnt) as drn, FROM    
projectname.datasetname.tablename1), 
t2 as ( SELECT 
        mnt,  
        rel, 
        lif, 
        lts, 
lokez FROM projectname.datasetname.tablename2  
WHERE lts <> ""  AND  _PARTITIONTIME = TIMESTAMP(CURRENT_DATE()) ) ,
t3 as  (SELECT  
       lif, 
       lifn,  
       lts,  
       par FROM `projectname.datasetname.tablename3`)  

    ,t4 as (SELECT rcv FROM `projectname.datasetname.tablename4` WHERE mes
    = 'PRO') 
   select * from ( 
   SELECT t1.mnt as mnt,
          t1.flag,
          t1.rn,
          t1.drn
          t2.rel as zel,
          t2.lokez as ZLOEKZ,
          t4.rcv as Zrcv
           FROM t1 left join t2 on replace(t1.mnt, '00000000', '') = 
    REPLACE(t2.mnt, '00000000', '') AND t1.lif = t2.lif and t2.lts <> ""  
    and  
    case when t1.flag = 'Y' and  t2.rel ='X' then 1
         when (t1.flag ='N' and t2.rel=t2.rel) or (t1.flag ='N' and t2.rel 
    is null)  then 1
         when t1.flag = 'Y' and  t2.rel <>'X' then 2
         else 3
         end = 1
    left join t3  ON t1.lif = t3.lif  AND t2.lts = t3.lts AND     
    t3.par = 'BA' left join t4 on t4.rcv = t3.lifn and  t2.lokez is null  ) 
    where     ZLOEKZ is null  order by mnt

As you can see I am using a case statement and even it seems to be not working fine. I am pasting the case condition below again

case when t1.flag = 'Y' and  t2.rel ='X' then 1
             when (t1.flag ='N' and t2.rel=t2.rel) or (t1.flag ='N' and 
      t2.rel 
        is null)  then 1
             when t1.flag = 'Y' and  t2.rel <>'X' then 2
             else 3
             end = 1

But the expected record count did not match so I added the above sql lines to see if my analytical functions were giving me result I wanted

rank() over (partition by mnt order by mnt) as rn, 
dense_rank() over (partition by mnt order by mnt) as drn

strangely for same mnt number the rank , dense rank and row_number function are assigning the same value what am i doing wrong here.

mnt     flag    rn     drn     rel     lokez     rcv
100      N       1      1       X       abc       123
100      N       1      1      null     xyz       123
100      N       1      1      null     def       234

This is my output

I mean as per my code for same mnt number I am seeing flag set to N instead of Y and for the rank and dense rank are giving me same number for all 3 mnt it is generating 1 instead of 123 (note for rank function I understand) but dense rank should not do that

I tried to convey the issue as efficiently as I could please let me know if there is any clarifications I can provide.

any help appreciated

thanks

1
"I tried to convey the issue as efficiently as I could please let me know if there is any clarifications I can provide." -> please, can you provide a query that exhibits the same behavior over public data? I haven't been able to reproduce - Felipe Hoffa
Sure I will and post it here. Thanks - kumarm
@Felipe Hoffa Hello I tried to replicate the issue I was getting using a public dataset and was not able to replicate it everything is working fine now. I think it was a glitch or something anyways count() works in my case thanks every one - kumarm

1 Answers

1
votes
SELECT * EXCEPT(ct) FROM (
  SELECT *, COUNT() OVER(PARTITION BY mnt) AS ct 
) WHERE ct=1 or zel='X'

This is the code snippet for the problem you mentioned. Use this in your code according to the logic.