How to get the latest data point in each month in each year?

Question

Here is an example snippet of my dataset. I have multiple data points for each month in each year , over a number of years. I want to get the latest row in each month in each year

Ex:

     data       value
   2019-01-1     10
   2019-01-7     20
   2019-01-28    30
                 
   2019-03-4     50
   2019-03-17    40 
   2019-03-28    20

   2020-04-16    80 
   2020-04-18    60
   2020-04-30    40

   2020-05-16    90 
   2020-05-18    10
   2020-05-30    80

Expected output

data         value
2019-01-28    30
2019-03-28    20
2020-04-30    40
2020-05-30    80

I would like to achieve this with flask-sqlalchemy. I'm thinking I'll have to group by month maybe? I'm quite new to sql queries. Any help would be appreciated!

Edit : I've tried this :

GROUP BY EXTRACT(year FROM date),
EXTRACT(month FROM date)

but this gives me the first row of each month , I want the last row, i.e the latest row

Thank you ! I tried this GROUP BY EXTRACT(year FROM date), EXTRACT(month FROM date) HAVING MAX(date) ORDER BY EXTRACT(YEAR FROM date) but I'm not able to get the last row of the month(Which is the required row). I'm getting the first row of the month. — riddhi athreya

Akina Akina · Accepted Answer · 2021-07-02T08:19:24

The solution for MySQL.

SELECT t1.*
FROM table AS t1
JOIN ( SELECT MAX(data) AS data
       FROM table t2
       GROUP BY DATE_FORMAT(data, '%Y%m') ) t3 USING (data)

The query assimes that data column have no duplicates (is defined as UNIQUE).

GROUP BY expression may be GROUP BY EXTRACT(year FROM data), EXTRACT(month FROM data) as you have tried... or even GROUP BY data DIV 100. It doesn't matter.

How to get the latest data point in each month in each year?

2 Answers