Given a table structure like:
CAT | BUSINESS_NR | TIME_A | VERSION | SOME_CODE |
---|---|---|---|---|
ABC | 123 | 2009-02-19T00:00:00 | 1 | OPP |
ABC | 456 | 2009-03-18 00:00:00 | 1 | ZUM |
ABC | 444 | 2009-03-18 00:00:00 | 1 | ZUM |
ABC | 444 | 2009-03-18 00:00:00 | 2 | MUZ |
ABC | 456 | 2009-04-18 00:00:00 | 2 | XXX |
ABC | 456 | 2009-04-18 00:00:00 | 3 | XXX |
ABC | 456 | 2009-04-18 00:00:00 | 4 | UIO |
ABC | 456 | 2009-05-18 00:00:00 | 5 | RQA |
DEF | 637 | 2018-02-16 00:00:00 | 1 | FAW |
DEF | 789 | 2018-02-17 00:00:00 | 1 | WER |
SPZ | 123 | 2018-02-16 00:00:00 | 1 | AAA |
SPZ | 123 | 2018-02-17 00:00:00 | 2 | BBB |
SPZ | 123 | 2018-02-18 00:00:00 | 3 | AAA |
SPZ | 123 | 2018-02-19 00:00:00 | 4 | CCC |
SPZ | 123 | 2018-02-20 00:00:00 | 5 | AAA |
SPZ | 123 | 2018-02-21 00:00:00 | 6 | DDD |
SPZ | 123 | 2018-02-22 00:00:00 | 7 | DDD |
SPZ | 123 | 2018-02-23 00:00:00 | 8 | DDD |
SPZ | 123 | 2018-02-24 00:00:00 | 9 | EEE |
SPZ | 123 | 2018-02-25 00:00:00 | 10 | EEE |
SPZ | 123 | 2018-02-26 00:00:00 | 11 | DDD |
SPZ | 123 | 2018-02-27 00:00:00 | 12 | BBB |
SPZ | 123 | 2018-02-26 00:00:00 | 13 | EEE |
SPZ | 123 | 2018-02-27 00:00:00 | 14 | EEE |
GHI | 248 | 2018-02-17 00:00:00 | 1 | QWE |
GHI | 248 | 2019-02-17 00:00:00 | 2 | PPP |
GHI | 357 | 2020-02-16 00:00:00 | 1 | FFF |
GHI | 420 | 2020-02-16 00:00:00 | 1 | QDS |
GHI | 357 | 2020-02-16 00:00:00 | 2 | GGG |
GHI | 357 | 2020-02-16 00:00:00 | 3 | LLL |
GHI | 357 | 2020-02-16 00:00:00 | 4 | LLL |
GHI | 357 | 2020-08-16 00:00:00 | 4 | FFF |
GHI | 357 | 2020-10-16 00:00:00 | 5 | ZZZ |
... which one can easily create by:
CREATE TABLE blah (CAT, BUSINESS_NR, TIME_A, VERSION, SOME_CODE) AS
SELECT 'ABC', 123, TIMESTAMP '2009-02-19 00:00:00 UTC', 1, 'OPP' FROM DUAL UNION ALL
SELECT 'ABC', 456, TIMESTAMP '2009-03-18 00:00:00 UTC', 1, 'ZUM' FROM DUAL UNION ALL
SELECT 'ABC', 444, TIMESTAMP '2009-03-18 00:00:00 UTC', 1, 'ZUM' FROM DUAL UNION ALL
SELECT 'ABC', 444, TIMESTAMP '2009-03-18 00:00:00 UTC', 2, 'MUZ' FROM DUAL UNION ALL
SELECT 'ABC', 456, TIMESTAMP '2009-04-18 00:00:00 UTC', 2, 'XXX' FROM DUAL UNION ALL
SELECT 'ABC', 456, TIMESTAMP '2009-04-18 00:00:00 UTC', 3, 'XXX' FROM DUAL UNION ALL
SELECT 'ABC', 456, TIMESTAMP '2009-04-18 00:00:00 UTC', 4, 'UIO' FROM DUAL UNION ALL
SELECT 'ABC', 456, TIMESTAMP '2009-05-18 00:00:00 UTC', 5, 'RQA' FROM DUAL UNION ALL
SELECT 'DEF', 637, TIMESTAMP '2018-02-16 00:00:00 UTC', 1, 'FAW' FROM DUAL UNION ALL
SELECT 'DEF', 789, TIMESTAMP '2018-02-17 00:00:00 UTC', 1, 'WER' FROM DUAL UNION ALL
SELECT 'SPZ', 123, TIMESTAMP '2018-02-16 00:00:00 UTC', 1, 'AAA' FROM DUAL UNION ALL
SELECT 'SPZ', 123, TIMESTAMP '2018-02-17 00:00:00 UTC', 2, 'BBB' FROM DUAL UNION ALL
SELECT 'SPZ', 123, TIMESTAMP '2018-02-18 00:00:00 UTC', 3, 'AAA' FROM DUAL UNION ALL
SELECT 'SPZ', 123, TIMESTAMP '2018-02-19 00:00:00 UTC', 4, 'CCC' FROM DUAL UNION ALL
SELECT 'SPZ', 123, TIMESTAMP '2018-02-20 00:00:00 UTC', 5, 'AAA' FROM DUAL UNION ALL
SELECT 'SPZ', 123, TIMESTAMP '2018-02-21 00:00:00 UTC', 6, 'DDD' FROM DUAL UNION ALL
SELECT 'SPZ', 123, TIMESTAMP '2018-02-22 00:00:00 UTC', 7, 'DDD' FROM DUAL UNION ALL
SELECT 'SPZ', 123, TIMESTAMP '2018-02-23 00:00:00 UTC', 8, 'DDD' FROM DUAL UNION ALL
SELECT 'SPZ', 123, TIMESTAMP '2018-02-24 00:00:00 UTC', 9, 'EEE' FROM DUAL UNION ALL
SELECT 'SPZ', 123, TIMESTAMP '2018-02-25 00:00:00 UTC', 10, 'EEE' FROM DUAL UNION ALL
SELECT 'SPZ', 123, TIMESTAMP '2018-02-26 00:00:00 UTC', 11, 'DDD' FROM DUAL UNION ALL
SELECT 'SPZ', 123, TIMESTAMP '2018-02-27 00:00:00 UTC', 12, 'BBB' FROM DUAL UNION ALL
SELECT 'SPZ', 123, TIMESTAMP '2018-02-26 00:00:00 UTC', 13, 'EEE' FROM DUAL UNION ALL
SELECT 'SPZ', 123, TIMESTAMP '2018-02-27 00:00:00 UTC', 14, 'EEE' FROM DUAL UNION ALL
SELECT 'GHI', 248, TIMESTAMP '2018-02-17 00:00:00 UTC', 1, 'QWE' FROM DUAL UNION ALL
SELECT 'GHI', 248, TIMESTAMP '2019-02-17 00:00:00 UTC', 2, 'PPP' FROM DUAL UNION ALL
SELECT 'GHI', 357, TIMESTAMP '2020-02-16 00:00:00 UTC', 1, 'FFF' FROM DUAL UNION ALL
SELECT 'GHI', 420, TIMESTAMP '2020-02-16 00:00:00 UTC', 1, 'QDS' FROM DUAL UNION ALL
SELECT 'GHI', 357, TIMESTAMP '2020-02-16 00:00:00 UTC', 2, 'GGG' FROM DUAL UNION ALL
SELECT 'GHI', 357, TIMESTAMP '2020-02-16 00:00:00 UTC', 3, 'LLL' FROM DUAL UNION ALL
SELECT 'GHI', 357, TIMESTAMP '2020-02-16 00:00:00 UTC', 4, 'LLL' FROM DUAL UNION ALL
SELECT 'GHI', 357, TIMESTAMP '2020-08-16 00:00:00 UTC', 4, 'FFF' FROM DUAL UNION ALL
SELECT 'GHI', 357, TIMESTAMP '2020-10-16 00:00:00 UTC', 5, 'ZZZ' FROM DUAL
... I want to count each change of SOME_CODE for each CAT per month. A change means a change of SOME_CODE within a Business_nr.
I started with the following approach:
select cat, moenat, sum(counta) from (
select
cat,
trunc(TIME_A, 'MON') as moenat,
count(distinct SOME_CODE) as counta
from blah
group by cat, BUSINESS_NR, trunc(TIME_A, 'MON')
) z
group by cat, moenat
order by cat, moenat
... which outputs almost what I want:
CAT | MONTH | sum(counta) |
---|---|---|
SPZ | FEB 2018 | 5 |
... | ... | ... |
But I need all real changes AND the condition: filter out duplicates that are coming one after the other, like I can demonstrate by CAT='SPZ':
CAT | BUSINESS_NR | TIME_A | VERSION | SOME_CODE |
---|---|---|---|---|
SPZ | 123 | 2018-02-21 00:00:00 | 6 | DDD |
SPZ | 123 | 2018-02-22 00:00:00 | 7 | DDD |
SPZ | 123 | 2018-02-23 00:00:00 | 8 | DDD |
and so on.
So for SPZ: versions 6 to 8, 9-10 and 13-14 should count each as one (not three or two times) because there was no effective change going on.
That means the expected query output (in FEB 2018) for CAT='SPZ' therefore must be like:
CAT | MONTH | sum(counta) |
---|---|---|
SPZ | FEB 2018 | 10 |
... | ... | ... |
... for 10 times there was an effective change of the code (starting value with version 1 'AAA' must count too, because each record indicates a change).
FYI: maybe parts of the solution (MATCH_RECOGNIZE) of this topic could help: SQL query for counting changes of values in a column but I am not sure.
Hint: order is defined by higher version number as well as time_a information.
Thanks for any new appoach.