I have to populate a teradata table from another source where that can be simplify like that:
+------+------+------------+------------+
| Col1 | Col2 | Col3 | Col4 |
+------+------+------------+------------+
| 1234 | 0 | 01/01/2009 | 01/04/2019 |
| 1234 | 3 | 01/01/2010 | 01/05/2020 |
| 2345 | 1 | 20/02/2013 | 01/04/2019 |
| 2345 | 0 | 20/02/2013 | 01/04/2018 |
| 2345 | 2 | 31/01/2009 | 01/04/2017 |
| 3456 | 0 | 01/01/2009 | 01/04/2019 |
| 3456 | 1 | 01/01/2015 | 01/04/2019 |
| 3456 | 1 | 01/01/2015 | 01/05/2017 |
| 3456 | 3 | 01/01/2015 | 01/04/2019 |
+------+------+------------+------------+
Col1 is duplicated in source so we have rules to select the right row (with col1 unique in final result) For if value in col1 :
- If value is duplicated then select the most recent date in Col3
- If (and only if) it is still duplicated then select row with col2=1
- If still duplicated then select most recent date in col4.
Considering the the previous table we should get the following result :
+------+------+------------+------------+
| Col1 | Col2 | Col3 | Col4 |
+------+------+------------+------------+
| 1234 | 3 | 01/01/2010 | 01/05/2020 |
| 2345 | 1 | 20/02/2013 | 01/04/2019 |
| 3456 | 1 | 01/01/2015 | 01/04/2019 |
+------+------+------------+------------+
I start using partition by to group each value occurrences in col 3 but i have no good idea on how to apply the conditions for each partion in a sql query
Thank you for your help
col4
. – Gordon Linoff