I have a table of bitcoin transactions:
CREATE TABLE `transactions` (
  `trans_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `trans_exchange` int(10) unsigned DEFAULT NULL,
  `trans_currency_base` int(10) unsigned DEFAULT NULL,
  `trans_currency_counter` int(10) unsigned DEFAULT NULL,
  `trans_tid` varchar(20) DEFAULT NULL,
  `trans_type` tinyint(4) DEFAULT NULL,
  `trans_price` decimal(15,4) DEFAULT NULL,
  `trans_amount` decimal(15,8) DEFAULT NULL,
  `trans_datetime` datetime DEFAULT NULL,
  `trans_sid` bigint(20) DEFAULT NULL,
  `trans_timestamp` int(10) unsigned DEFAULT NULL,
  PRIMARY KEY (`trans_id`),
  KEY `trans_tid` (`trans_tid`),
  KEY `trans_datetime` (`trans_datetime`),
  KEY `trans_timestmp` (`trans_timestamp`),
  KEY `trans_price` (`trans_price`),
  KEY `trans_amount` (`trans_amount`)
) ENGINE=MyISAM AUTO_INCREMENT=6162559 DEFAULT CHARSET=utf8;
As you can see from the AUTO_INCREMENT value, the table has over 6 million entries. There will eventually be many more.
I would like to query the table to obtain the max price, min price, trade count, and total amount traded (volume) during arbitrary time intervals. To accomplish this, I'm using a query like this:
SELECT
  DATE_FORMAT(MIN(transactions.trans_datetime), '%Y/%m/%d %H:%i:00') AS trans_datetime,
  SUM(transactions.trans_amount) AS trans_volume,
  MAX(transactions.trans_price) AS trans_max_price,
  MIN(transactions.trans_price) AS trans_min_price,
  COUNT(transactions.trans_id) AS trans_count
FROM transactions
WHERE transactions.trans_datetime BETWEEN '2014-09-14 00:00:00' AND '2015-09-13 23:59:00'
GROUP BY transactions.trans_timestamp DIV 86400
That should select transactions made over a one-year period, grouped by day (86,400 seconds).
The idea is that the trans_timestamp field, which holds the same value as trans_datetime but as a Unix timestamp (I found this faster than calling UNIX_TIMESTAMP(trans_datetime) in the query), is integer-divided by the number of seconds I want each interval to span.
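To illustrate the parameterization: only the divisor changes per interval size, so the same query grouped hourly would look like this (a sketch; the date range is arbitrary):

```sql
SELECT
  DATE_FORMAT(MIN(trans_datetime), '%Y/%m/%d %H:%i:00') AS trans_datetime,
  SUM(trans_amount) AS trans_volume,
  MAX(trans_price) AS trans_max_price,
  MIN(trans_price) AS trans_min_price,
  COUNT(trans_id) AS trans_count
FROM transactions
WHERE trans_datetime BETWEEN '2014-09-14 00:00:00' AND '2015-09-13 23:59:00'
GROUP BY trans_timestamp DIV 3600  -- 3600 seconds = hourly buckets
```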
The problem: the query is slow. I'm getting 4+ seconds of processing time. Here is the result of EXPLAIN:
           id: 1
  select_type: SIMPLE
        table: transactions
         type: ALL
possible_keys: trans_datetime,trans_timestmp
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 6162558
        Extra: Using where; Using temporary; Using filesort
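Note that the query filters on trans_datetime but groups on trans_timestamp, so no single index serves both, and EXPLAIN shows a full scan of all 6 million rows. One variant I have been experimenting with filters on trans_timestamp instead, so the filter and the grouping run off the same column (a sketch; this assumes the timestamp values exactly mirror the datetimes):

```sql
SELECT
  DATE_FORMAT(MIN(trans_datetime), '%Y/%m/%d %H:%i:00') AS trans_datetime,
  SUM(trans_amount) AS trans_volume,
  MAX(trans_price) AS trans_max_price,
  MIN(trans_price) AS trans_min_price,
  COUNT(trans_id) AS trans_count
FROM transactions
WHERE trans_timestamp BETWEEN UNIX_TIMESTAMP('2014-09-14 00:00:00')
                          AND UNIX_TIMESTAMP('2015-09-13 23:59:00')
GROUP BY trans_timestamp DIV 86400
```

Even so, the optimizer may still prefer a full scan when the range covers most of the table.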
The question: is it possible to optimize this better? Is this structure or approach flawed? I have tried several approaches, and have only succeeded in making modest millisecond-type gains.
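For reference, one direction I have sketched is a composite index, so that when both the range filter and the grouping run off trans_timestamp the scan can stay within a single index (hypothetical; the index name is mine, and whether it actually helps would need to be verified with EXPLAIN):

```sql
ALTER TABLE transactions
  ADD KEY trans_ts_price_amount (trans_timestamp, trans_price, trans_amount);
```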