1
votes

Given a table structure like:

CAT BUSINESS_NR TIME_A VERSION SOME_CODE
ABC 123 2009-02-19T00:00:00 1 OPP
ABC 456 2009-03-18 00:00:00 1 ZUM
ABC 444 2009-03-18 00:00:00 1 ZUM
ABC 444 2009-03-18 00:00:00 2 MUZ
ABC 456 2009-04-18 00:00:00 2 XXX
ABC 456 2009-04-18 00:00:00 3 XXX
ABC 456 2009-04-18 00:00:00 4 UIO
ABC 456 2009-05-18 00:00:00 5 RQA
DEF 637 2018-02-16 00:00:00 1 FAW
DEF 789 2018-02-17 00:00:00 1 WER
SPZ 123 2018-02-16 00:00:00 1 AAA
SPZ 123 2018-02-17 00:00:00 2 BBB
SPZ 123 2018-02-18 00:00:00 3 AAA
SPZ 123 2018-02-19 00:00:00 4 CCC
SPZ 123 2018-02-20 00:00:00 5 AAA
SPZ 123 2018-02-21 00:00:00 6 DDD
SPZ 123 2018-02-22 00:00:00 7 DDD
SPZ 123 2018-02-23 00:00:00 8 DDD
SPZ 123 2018-02-24 00:00:00 9 EEE
SPZ 123 2018-02-25 00:00:00 10 EEE
SPZ 123 2018-02-26 00:00:00 11 DDD
SPZ 123 2018-02-27 00:00:00 12 BBB
SPZ 123 2018-02-26 00:00:00 13 EEE
SPZ 123 2018-02-27 00:00:00 14 EEE
GHI 248 2018-02-17 00:00:00 1 QWE
GHI 248 2019-02-17 00:00:00 2 PPP
GHI 357 2020-02-16 00:00:00 1 FFF
GHI 420 2020-02-16 00:00:00 1 QDS
GHI 357 2020-02-16 00:00:00 2 GGG
GHI 357 2020-02-16 00:00:00 3 LLL
GHI 357 2020-02-16 00:00:00 4 LLL
GHI 357 2020-08-16 00:00:00 4 FFF
GHI 357 2020-10-16 00:00:00 5 ZZZ

... which one can easily create by:

CREATE TABLE blah (CAT, BUSINESS_NR, TIME_A, VERSION, SOME_CODE) AS
SELECT 'ABC',   123,    TIMESTAMP '2009-02-19 00:00:00 UTC',    1,  'OPP' FROM DUAL UNION ALL
SELECT 'ABC',   456,    TIMESTAMP '2009-03-18 00:00:00 UTC',    1,  'ZUM' FROM DUAL UNION ALL
SELECT 'ABC',   444,    TIMESTAMP '2009-03-18 00:00:00 UTC',    1,  'ZUM' FROM DUAL UNION ALL
SELECT 'ABC',   444,    TIMESTAMP '2009-03-18 00:00:00 UTC',    2,  'MUZ' FROM DUAL UNION ALL
SELECT 'ABC',   456,    TIMESTAMP '2009-04-18 00:00:00 UTC',    2,  'XXX' FROM DUAL UNION ALL
SELECT 'ABC',   456,    TIMESTAMP '2009-04-18 00:00:00 UTC',    3,  'XXX' FROM DUAL UNION ALL
SELECT 'ABC',   456,    TIMESTAMP '2009-04-18 00:00:00 UTC',    4,  'UIO' FROM DUAL UNION ALL
SELECT 'ABC',   456,    TIMESTAMP '2009-05-18 00:00:00 UTC',    5,  'RQA' FROM DUAL UNION ALL
SELECT 'DEF',   637,    TIMESTAMP '2018-02-16 00:00:00 UTC',    1,  'FAW' FROM DUAL UNION ALL
SELECT 'DEF',   789,    TIMESTAMP '2018-02-17 00:00:00 UTC',    1,  'WER' FROM DUAL UNION ALL
SELECT 'SPZ',   123,    TIMESTAMP '2018-02-16 00:00:00 UTC',    1,  'AAA' FROM DUAL UNION ALL
SELECT 'SPZ',   123,    TIMESTAMP '2018-02-17 00:00:00 UTC',    2,  'BBB' FROM DUAL UNION ALL
SELECT 'SPZ',   123,    TIMESTAMP '2018-02-18 00:00:00 UTC',    3,  'AAA' FROM DUAL UNION ALL
SELECT 'SPZ',   123,    TIMESTAMP '2018-02-19 00:00:00 UTC',    4,  'CCC' FROM DUAL UNION ALL
SELECT 'SPZ',   123,    TIMESTAMP '2018-02-20 00:00:00 UTC',    5,  'AAA' FROM DUAL UNION ALL
SELECT 'SPZ',   123,    TIMESTAMP '2018-02-21 00:00:00 UTC',    6,  'DDD' FROM DUAL UNION ALL
SELECT 'SPZ',   123,    TIMESTAMP '2018-02-22 00:00:00 UTC',    7,  'DDD' FROM DUAL UNION ALL
SELECT 'SPZ',   123,    TIMESTAMP '2018-02-23 00:00:00 UTC',    8,  'DDD' FROM DUAL UNION ALL
SELECT 'SPZ',   123,    TIMESTAMP '2018-02-24 00:00:00 UTC',    9,  'EEE' FROM DUAL UNION ALL
SELECT 'SPZ',   123,    TIMESTAMP '2018-02-25 00:00:00 UTC',    10, 'EEE' FROM DUAL UNION ALL
SELECT 'SPZ',   123,    TIMESTAMP '2018-02-26 00:00:00 UTC',    11, 'DDD' FROM DUAL UNION ALL
SELECT 'SPZ',   123,    TIMESTAMP '2018-02-27 00:00:00 UTC',    12, 'BBB' FROM DUAL UNION ALL
SELECT 'SPZ',   123,    TIMESTAMP '2018-02-26 00:00:00 UTC',    13, 'EEE' FROM DUAL UNION ALL
SELECT 'SPZ',   123,    TIMESTAMP '2018-02-27 00:00:00 UTC',    14, 'EEE' FROM DUAL UNION ALL
SELECT 'GHI',   248,    TIMESTAMP '2018-02-17 00:00:00 UTC',    1,  'QWE' FROM DUAL UNION ALL
SELECT 'GHI',   248,    TIMESTAMP '2019-02-17 00:00:00 UTC',    2,  'PPP' FROM DUAL UNION ALL
SELECT 'GHI',   357,    TIMESTAMP '2020-02-16 00:00:00 UTC',    1,  'FFF' FROM DUAL UNION ALL
SELECT 'GHI',   420,    TIMESTAMP '2020-02-16 00:00:00 UTC',    1,  'QDS' FROM DUAL UNION ALL
SELECT 'GHI',   357,    TIMESTAMP '2020-02-16 00:00:00 UTC',    2,  'GGG' FROM DUAL UNION ALL
SELECT 'GHI',   357,    TIMESTAMP '2020-02-16 00:00:00 UTC',    3,  'LLL' FROM DUAL UNION ALL
SELECT 'GHI',   357,    TIMESTAMP '2020-02-16 00:00:00 UTC',    4,  'LLL' FROM DUAL UNION ALL
SELECT 'GHI',   357,    TIMESTAMP '2020-08-16 00:00:00 UTC',    4,  'FFF' FROM DUAL UNION ALL
SELECT 'GHI',   357,    TIMESTAMP '2020-10-16 00:00:00 UTC',    5,  'ZZZ' FROM DUAL

... I want to count each change of SOME_CODE for each CAT per month. A change means a change of SOME_CODE within a Business_nr.

I started with the following approach:

select cat, moenat, sum(counta) from (
select 
cat, 
trunc(TIME_A, 'MON') as moenat, 
count(distinct SOME_CODE) as counta
from blah
group by cat, BUSINESS_NR, trunc(TIME_A, 'MON')
) z
group by cat, moenat
order by cat, moenat

... which outputs almost what I want:

CAT MONTH sum(counta)
SPZ FEB 2018 5
... ... ...

But I need all real changes AND the condition: filter out duplicates that are coming one after the other, like I can demonstrate by CAT='SPZ':

CAT BUSINESS_NR TIME_A VERSION SOME_CODE
SPZ 123 2018-02-21 00:00:00 6 DDD
SPZ 123 2018-02-22 00:00:00 7 DDD
SPZ 123 2018-02-23 00:00:00 8 DDD

and so on.

So for SPZ: versions 6 to 8, 9-10 and 13-14 should count each as one (not three or two times) because there was no effective change going on.

That means the expected query output (in FEB 2018) for CAT='SPZ' therefore must be like:

CAT MONTH sum(counta)
SPZ FEB 2018 10
... ... ...

... for 10 times there was an effective change of the code (starting value with version 1 'AAA' must count too, because each record indicates a change).

FYI: maybe parts of the solution (MATCH_RECOGNIZE) of this topic could help: SQL query for counting changes of values in a column but I am not sure.

Hint: order is defined by higher version number as well as time_a information.

Thanks for any new appoach.

1
Yes, MATCH_RECOGNIZE might work well; see asktom.oracle.com/pls/apex/…Mark Stewart
Your example for SPZ is clear, but the business nr is same for all records. What output would you want to see for GHI ? If there is only 1 row, for a business nr/month combo, does that count as a change ? Do you look within a business nr but across months ?Koen Lostrie

1 Answers

2
votes

You can use the LAG function to look at the previous version to see if some_code matches the current version, then SUM to get the total of all the code changes.

  SELECT cat, moenat, SUM (version_change) AS total_version_changes
    FROM (SELECT cat,
                 TRUNC (TIME_A, 'MON')    AS moenat,
                 CASE
                     WHEN    some_code <>
                             LAG (some_code) OVER (PARTITION BY cat ORDER BY version, time_a)
                          OR LAG (some_code) OVER (PARTITION BY cat ORDER BY version, time_a) IS NULL --Needed for version 1
                     THEN
                         1
                     ELSE
                         0
                 END                      AS version_change
            FROM blah)
GROUP BY cat, moenat
ORDER BY cat, moenat;

   CAT       MOENAT    TOTAL_VERSION_CHANGES
______ ____________ ________________________
ABC    01-FEB-09                           1
ABC    01-MAR-09                           2
ABC    01-APR-09                           2
ABC    01-MAY-09                           1
DEF    01-FEB-18                           2
GHI    01-FEB-18                           1
GHI    01-FEB-19                           1
GHI    01-FEB-20                           4
GHI    01-AUG-20                           1
GHI    01-OCT-20                           1
SPZ    01-FEB-18                          10