29
votes

I have a MySql table consisting of daily stock quotes (open, high, low, close and volume) which I'm trying to convert into weekly data on the fly. So far, I have the following function, which works for the highs, lows, and volume:

SELECT MIN(_low), MAX(_high), AVG(_volume),
CONCAT(YEAR(_date), "-", WEEK(_date)) AS myweek
FROM mystockdata
GROUP BY myweek
ORDER BY _date;

I need to select the first instance of _open in the above query. So for example, if there was a holiday on Monday (in a particular week) and stock market opened on Tuesday, _open value should be selected from the Tuesday that's grouped into its week. Similarly, the close value should be the last _close from that week.

Is it possible to select something like FIRST() and LAST() in MySql so that the above could be wrapped up within a single SELECT rather than using nested select queries?

Here's my table's create statement to get an idea of the schema:

delimiter $$
CREATE TABLE `mystockdata` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `symbol_id` int(11) NOT NULL,
  `_open` decimal(11,2) NOT NULL,
  `_high` decimal(11,2) NOT NULL,
  `_low` decimal(11,2) NOT NULL,
  `_close` decimal(11,2) NOT NULL,
  `_volume` bigint(20) NOT NULL,
  `add_date` date NOT NULL,
  PRIMARY KEY (`id`),
  KEY `Symbol_Id` (`symbol_id`,`add_date`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8$$

Update: There are no nulls, wherever there's a holiday/weekend, the table does not carry any record for that date.

4
As you know, FIRST and LAST as you described don't exist in MySQL. It may be possible with one query, but it depends on your table schema… Update your post.Jason McCreary
Thanks, Jason. Please see my edits above.Zishan Danish Neno

4 Answers

48
votes

If you are using MySQL 8, the preferable solution would make use of the window functions FIRST_VALUE() and/or LAST_VALUE(), which are now available. Please have a look at Lukas Eder's answer.

But if you're using an older version of MySQL, those functions are not supported. You have to simulate them using some kind of workarounds, for example you could make use of the aggregated string function GROUP_CONCAT() that creates a set of all _open and _close values of the week ordered by _date for _open and by _date desc for _close, and extracting the first element of the set:

select
  min(_low),
  max(_high),
  avg(_volume),
  concat(year(_date), "-", lpad(week(_date), 2, '0')) AS myweek,
  substring_index(group_concat(cast(_open as CHAR) order by _date), ',', 1 ) as first_open,
  substring_index(group_concat(cast(_close as CHAR) order by _date desc), ',', 1 ) as last_close
from
  mystockdata
group by
  myweek
order by
  myweek
;

Another solution would make use of subqueries with LIMIT 1 in the SELECT clause:

select
  min(_low),
  max(_high),
  avg(_volume),
  concat(year(_date), "-", lpad(week(_date), 2, '0')) AS myweek,
  (
    select _open
    from mystockdata m
    where concat(year(_date), "-", lpad(week(_date), 2, '0'))=myweek
    order by _date
    LIMIT 1
  ) as first_open,
  (
    select _close
    from mystockdata m
    where concat(year(_date), "-", lpad(week(_date), 2, '0'))=myweek
    order by _date desc
    LIMIT 1
  ) as last_close
from
  mystockdata
group by
  myweek
order by
  myweek
;

Please note I added the LPAD() string function to myweek, to make the week number always two digits long, otherwise weeks won't be ordered correctly.

Also be careful when using substring_index in conjunction with group_concat(): if one of the grouped strings contains a comma, the function might not return the expected result.

7
votes

Starting with MySQL 8, you would ideally use window functions for the task:

WITH 
  t1 AS (
    SELECT _low, _high, _volume, CONCAT(YEAR(_date), "-", WEEK(_date)) AS myweek
    FROM mystockdata
  ),
  t2 AS (
    SELECT 
      t1.*, 
      FIRST_VALUE(_open) OVER (PARTITION BY myweek ORDER BY _date) AS first_open,
      FIRST_VALUE(_close) OVER (PARTITION BY myweek ORDER BY _date DESC) AS last_close
    FROM t1
  )
SELECT MIN(_low), MAX(_high), AVG(_volume), myweek, MIN(first_open), MAX(last_close)
FROM t2
GROUP BY myweek
ORDER BY myweek;
2
votes

You will likely need to COALESCE function to get the first value. However, you need to make sure that days without data (weekends and holidays) have a null value for _open on those days without data.

Usage would be:

SELECT MIN(_low), MAX(_high), AVG(_volume), COALESCE(_open)
CONCAT(YEAR(_date), "-", WEEK(_date)) AS myweek
FROM mystockdata
GROUP BY myweek
ORDER BY _date;

For the last() value, I can only think of a pretty hacky solution, which would be to use GROUP_CONCAT and then string manipulation to get the last value from the list. So perhaps something like this:

SELECT MIN(_low), MAX(_high), AVG(_volume), COALESCE(_open), SUBSTRING_INDEX(GROUP_CONCAT(_close), ',', -1)
CONCAT(YEAR(_date), "-", WEEK(_date)) AS myweek
FROM mystockdata
GROUP BY myweek
ORDER BY _date;

Note you could also use the GROUP_CONCAT approach for the first item instead of coalesce if you wanted consistent looking query

SELECT MIN(_low), MAX(_high), AVG(_volume), SUBSTRING_INDEX(GROUP_CONCAT(_open), ',', 1), SUBSTRING_INDEX(GROUP_CONCAT(_close), ',', -1)
CONCAT(YEAR(_date), "-", WEEK(_date)) AS myweek
FROM mystockdata
GROUP BY myweek
ORDER BY _date;

For GROUP_CONCAT to work properly you also need to make sure the dates without values have null in _open and _close fields.

-1
votes

Basically, what you need to do:

  1. group by PRODUCTID
  2. within each group, order by LOCATION
  3. select the FIRST price for the same product as ordered by LOCATION

Putting them together, you can use the following query:

SELECT PRODUCTID, 
   SUBSTRING_INDEX(GROUP_CONCAT(CAST(LOCATION AS CHAR) ORDER BY LOCATION DESC), ',', 1) AS LOCATION,
   SUBSTRING_INDEX(GROUP_CONCAT(CAST(PRICE AS CHAR) ORDER BY LOCATION DESC), ',', 1) AS PRICE
FROM ProductLocation
GROUP BY PRODUCTID;

Note that MySQL does not have FIRST() and LAST() aggregate functions for GROUP BY but such FIRST() AND LAST() can be simulated by using GROUP_CONCAT() and SUBSTRING_INDEX() functions.