0
votes

I have to populate a teradata table from another source where that can be simplify like that:

+------+------+------------+------------+
| Col1 | Col2 |    Col3    |    Col4    |
+------+------+------------+------------+
| 1234 |    0 | 01/01/2009 | 01/04/2019 |
| 1234 |    3 | 01/01/2010 | 01/05/2020 |
| 2345 |    1 | 20/02/2013 | 01/04/2019 |
| 2345 |    0 | 20/02/2013 | 01/04/2018 |
| 2345 |    2 | 31/01/2009 | 01/04/2017 |
| 3456 |    0 | 01/01/2009 | 01/04/2019 |
| 3456 |    1 | 01/01/2015 | 01/04/2019 |
| 3456 |    1 | 01/01/2015 | 01/05/2017 |
| 3456 |    3 | 01/01/2015 | 01/04/2019 |
+------+------+------------+------------+

Col1 is duplicated in source so we have rules to select the right row (with col1 unique in final result) For if value in col1 :

  • If value is duplicated then select the most recent date in Col3
  • If (and only if) it is still duplicated then select row with col2=1
  • If still duplicated then select most recent date in col4.

Considering the the previous table we should get the following result :

+------+------+------------+------------+
| Col1 | Col2 |    Col3    |    Col4    |
+------+------+------------+------------+
| 1234 |    3 | 01/01/2010 | 01/05/2020 |
| 2345 |    1 | 20/02/2013 | 01/04/2019 |
| 3456 |    1 | 01/01/2015 | 01/04/2019 |
+------+------+------------+------------+

I start using partition by to group each value occurrences in col 3 but i have no good idea on how to apply the conditions for each partion in a sql query

Thank you for your help

2
Your third condition is on the same column as the first condition. I assume one should be col4.Gordon Linoff
@GordonLinoff exactly, I have corrected the question.codesensei

2 Answers

1
votes

You can use row_number():

select t.*
from (select t.*,
             row_number() over (partition by col1
                                order by col3 desc,
                                         (case when col2 = 1 then 1 else 2 end),
                                         col4 desc
                               ) as seqnum
      from t
     ) t
where seqnum = 1;
1
votes

You can use QUALIFY in Teradata to simplify the syntax:

SELECT col1, col2, col3, col4
FROM mytable
QUALIFY ROW_NUMBER() OVER(
  PARTITION BY col1 -- Group rows by "col1" values
  ORDER BY col3 DESC, CASE WHEN col2 = 1 THEN 1 ELSE 2 END, col4 DESC -- Order rows
) = 1 -- Get "first" row in each group

Otherwise, this is the same as the answer above.