0
votes

In BigQuery I am trying to create a view with Sales TY (This Year Financial) and Sales LY in one row. The Sales LY measure should be for the same corresponding week number TY as per the example below.

I am using the following code:

SUM(table_1.sales) OVER (PARTITION BY table_1.client ORDER BY table_1.fw_end_date DESC ROWS BETWEEN 53 FOLLOWING AND 53 FOLLOWING ) as SALES_LY

The problem is that sometimes there are missing weeks which means that the window function will not reflect it because it calculates number of rows not the number of actual weeks.

In the example below week 19 and week 20 are missing in this financial year hence the corresponding results for the sales_ly measure are wrong.

Could it be possible fixed by adding/joining additional rows for the missing dates with NULL values?

row_number  client  fy  wofy    fw_end_date year     sales          sales_LY                
1           1111    2020    34  2020-02-23  TY       74.97971177    75.63215281 
2           1111    2020    33  2020-02-16  TY       42.42109122    68.14894689
3           1111    2020    32  2020-02-09  TY       19.85037174    87.12654065
4           1111    2020    31  2020-02-02  TY       16.56226835    3.122137476
5           1111    2020    30  2020-01-26  TY       1.800185225    10.74736963
6           1111    2020    29  2020-01-19  TY       24.75012318    38.63631908
7           1111    2020    28  2020-01-12  TY       25.25663409    1.387588554
8           1111    2020    27  2020-01-05  TY       72.26309414    0.873563634
9           1111    2020    26  2019-12-29  TY       48.42015566    81.57363107
10          1111    2020    25  2019-12-22  TY       29.94857681    41.80248501
11          1111    2020    24  2019-12-15  TY       60.84110104    26.66495665
12          1111    2020    23  2019-12-08  TY       6.810985936    48.21324987
13          1111    2020    22  2019-12-01  TY       55.25000762    8.681731452
14          1111    2020    21  2019-11-24  TY       7.314435129    61.50365765
15          1111    2020    18  2019-11-03  TY       28.86674749    84.68936948 --wrong     
16          1111    2020    17  2019-10-27  TY       93.06304298            
17          1111    2020    16  2019-10-20  TY       31.18234699            
18          1111    2020    15  2019-10-13  TY       56.40700057            
19          1111    2020    14  2019-10-06  TY       70.7995385         
20          1111    2020    13  2019-09-29  TY       88.80009525            
21          1111    2020    12  2019-09-22  TY       75.12011037            
22          1111    2020    11  2019-09-15  TY       28.54977137            
23          1111    2020    10  2019-09-08  TY       38.05238915            
24          1111    2020    9   2019-09-01  TY       7.256419393            
25          1111    2020    8   2019-08-25  TY       32.81145188            
26          1111    2020    7   2019-08-18  TY       32.04938194            
27          1111    2020    6   2019-08-11  TY       53.890913          
28          1111    2020    5   2019-08-04  TY       17.4527262         
29          1111    2020    4   2019-07-28  TY       66.08187866            
30          1111    2020    3   2019-07-21  TY       9.331124689            
31          1111    2020    2   2019-07-14  TY       61.35972079            
32          1111    2020    1   2019-07-07  TY       68.02729471            
33          1111    2019    52  2019-06-23  LY       65.09319706            
34          1111    2019    51  2019-06-16  LY       46.3647103         
35          1111    2019    50  2019-06-09  LY       45.04742519            
36          1111    2019    49  2019-06-02  LY       10.72003618            
37          1111    2019    48  2019-05-26  LY       69.73143446            
38          1111    2019    47  2019-05-19  LY       8.3106988          
39          1111    2019    46  2019-05-12  LY       18.53940931            
40          1111    2019    45  2019-05-05  LY       78.27473501            
41          1111    2019    44  2019-04-28  LY       4.799286544            
42          1111    2019    43  2019-04-21  LY       94.58933139            
43          1111    2019    42  2019-04-14  LY       27.06059414            
44          1111    2019    41  2019-04-07  LY       53.42375151            
45          1111    2019    40  2019-03-31  LY       22.92189479            
46          1111    2019    39  2019-03-24  LY       95.3449253         
47          1111    2019    38  2019-03-17  LY       62.34159147            
48          1111    2019    37  2019-03-10  LY       73.99278829            
49          1111    2019    36  2019-03-03  LY       39.517017          
50          1111    2019    35  2019-02-24  LY       50.46146309            
51          1111    2019    34  2019-02-17  LY       75.63215281            
52          1111    2019    33  2019-02-10  LY       68.14894689            
53          1111    2019    32  2019-02-03  LY       87.12654065            
54          1111    2019    31  2019-01-27  LY       3.122137476            
55          1111    2019    30  2019-01-20  LY       10.74736963            
56          1111    2019    29  2019-01-13  LY       38.63631908            
57          1111    2019    28  2019-01-06  LY       1.387588554            
58          1111    2019    27  2018-12-30  LY       0.873563634            
59          1111    2019    26  2018-12-23  LY       81.57363107            
60          1111    2019    25  2018-12-16  LY       41.80248501            
61          1111    2019    24  2018-12-09  LY       26.66495665            
62          1111    2019    23  2018-12-02  LY       48.21324987            
63          1111    2019    22  2018-11-25  LY       8.681731452            
64          1111    2019    21  2018-11-18  LY       61.50365765            
65          1111    2019    20  2018-11-11  LY       84.68936948            

1

1 Answers

0
votes

Instead of jumping 53 rows, go and search for the exact 1 year ago date:

WITH data AS (
  SELECT * 
  FROM `fh-bigquery.weather_gsod.all` 
  WHERE date BETWEEN '2018-12-01' AND '2020-02-24'
  AND name LIKE 'SAN FRANCISCO INTERNATIONAL A'
), main_query AS (
  SELECT name, date, temp
    , ARRAY_AGG(STRUCT(date, temp)) OVER(PARTITION BY name ORDER BY date ROWS BETWEEN 366 PRECEDING AND 310 PRECEDING ) over_array
  FROM data a 
)

SELECT * EXCEPT(over_array)
  , (SELECT temp FROM UNNEST(over_array) WHERE date=DATE_SUB(a.date, INTERVAL 1 year)) prev_year
FROM main_query a
ORDER BY name, date DESC 

I did this with days - you can do the same with weeks instead.


Solved for weeks:

WITH data AS (
  SELECT ROUND(AVG(temp),1) temp, DATE_TRUNC(date, week) week, name 
  FROM `fh-bigquery.weather_gsod.all` 
  WHERE date BETWEEN '2018-12-01' AND '2020-02-24'
  AND name LIKE 'SAN FRANCISCO INTERNATIONAL A'
  GROUP BY name, week
), main_query AS (
  SELECT name, week, temp
    , ARRAY_AGG(STRUCT(week, temp)) 
      OVER(PARTITION BY name ORDER BY week 
      ROWS BETWEEN 53 PRECEDING AND 40 PRECEDING ) over_array
  FROM data a 
)

SELECT * EXCEPT(over_array)
  , (SELECT temp FROM UNNEST(over_array) 
     WHERE EXTRACT(YEAR FROM week)+1=EXTRACT(YEAR FROM a.week)
     AND EXTRACT(WEEK FROM week)=EXTRACT(WEEK FROM a.week)) prev_year
FROM main_query a
ORDER BY name, week DESC