2
votes

I tried to search on stackoverflow, google. I mostly see that we use max. But, on my records, it will be a thousand rows, so it would eat so much time before it get's finished.

Basically, I just want to get the latest record by datetime or unixtime.

The table records, sorted by datetime ASC:

Query: select * from accounts order by datetime ASC;

+----------+--------+---------------------+------------+
| user_id  | status | datetime            | unixtime   |
+----------+--------+---------------------+------------+
| 14254047 |      1 | 2013-03-27 01:25:12 | 1364343912 | 
| 14254270 |      1 | 2013-03-27 09:36:58 | 1364373418 | 
| 14254619 |      1 | 2013-03-27 16:07:04 | 1364396824 | 
| 14254047 |      0 | 2013-05-08 07:20:39 | 1367990439 | 
| 14254270 |      0 | 2013-05-09 04:01:36 | 1368064896 | 
| 14254619 |      0 | 2013-05-09 04:01:36 | 1368064896 | 
+----------+--------+---------------------+------------+

When I grouped them by user_id, the result always show the first record. Not the latest, even adding order by datetime or unixtime DESC

Query: select * from accounts group by user_id order by datetime DESC;

+----------+--------+---------------------+------------+
| user_id  | status | datetime            | unixtime   |
+----------+--------+---------------------+------------+
| 14254047 |      1 | 2013-03-27 01:25:12 | 1364343912 | 
| 14254270 |      1 | 2013-03-27 09:36:58 | 1364373418 | 
| 14254619 |      1 | 2013-03-27 16:07:04 | 1364396824 | 
+----------+--------+---------------------+------------+

The result I wanted are the following:

+----------+--------+---------------------+------------+
| user_id  | status | datetime            | unixtime   |
+----------+--------+---------------------+------------+
| 14254047 |      0 | 2013-05-08 07:20:39 | 1367990439 | 
| 14254270 |      0 | 2013-05-09 04:01:36 | 1368064896 | 
| 14254619 |      0 | 2013-05-09 04:01:36 | 1368064896 | 
+----------+--------+---------------------+------------+

What could be the best approach, without using MAX()?

Regards

3

3 Answers

2
votes

Like in most auto-increment situations the max(ID) is actually a better indicator of the most recent record then max(datetime) is. The reason for this is that the unique auto-increment is a tie-breaker on records with the same time.

Even better is that many out there probably already have an index on the auto-increment column so use this if you can.

The caveats are:

  1. you like accuracy and care about data quality
  2. that you care about the relative time the item was created. If a modification to datetime is possible and desired then you use datetime column.
  3. That you have an auto-increment column (like ID described above) in your accounts table. The select * says you do not so that's too bad. But you could add one!

the query. I called the ID column above accounts.id in this query:

select a.*
from accounts a join
    (select user_id, max(id) as maxid
     from accounts
     group by user_id
    ) asum
  on asum.user_id = a.user_id and a.id = asum.maxid
order by a.id DESC;
0
votes

The proper way to do this is using a join:

select a.*
from accounts a join
     (select user_id, max(datetime) as maxdt
      from accounts
      group by user_id
     ) asum
     on asum.user_id = a.user_id and a.datetime = asum.maxdt
order by datetime DESC;

Your original query uses an extension in MySQL where you can have columns in the select clause that are not in the group by clause (or in aggregation functions). When you do this, MySQL is explicit that the values are not determinate. They may sometimes come from the first row, but this behavior is not guaranteed.

0
votes

I was stucked in the same problem and I got my solution on the solution provided by Oliver Hanappi