You should NOT disable strict
or remove ONLY_FULL_GROUP_BY
. The problem is that your query is ambiguous. That may not make a difference to your output, or it could cause huge problems. It's better for you to be sure.
A great explanation can be read on Percona (summed up below).
Problem
Consider the following situation:
+----+--------------------+---------+---------------------+
| id | page_url | user_id | ts |
+----+--------------------+---------+---------------------+
| 1 | /index.html | 1 | 2019-04-17 12:21:32 |
| 2 | /index.html | 2 | 2019-04-17 12:21:35 |
| 3 | /news.php | 1 | 2019-04-17 12:22:11 |
| 4 | /store_offers.php | 3 | 2019-04-17 12:22:41 |
| 5 | /store_offers.html | 2 | 2019-04-17 12:23:04 |
| 6 | /faq.html | 1 | 2019-04-17 12:23:22 |
| 7 | /index.html | 3 | 2019-04-17 12:32:25 |
| 8 | /news.php | 2 | 2019-04-17 12:32:38 |
+----+--------------------+---------+---------------------+
Now we want to issue a query to calculate the most visited pages. This is probably what you're used to writing:
SELECT page_url, user_id, COUNT(*) AS visits
FROM web_log
GROUP BY page_url
ORDER BY COUNT(*) DESC;
But look at the results:
+-------------------+---------+--------+
| page_url | user_id | visits |
+-------------------+---------+--------+
| /index.html | 1 | 3 |
| /news.php | 1 | 2 |
| /store_offers.php | 3 | 2 |
| /faq.html | 1 | 1 |
+-------------------+---------+--------+
The query works, but it’s not really correct. It is easily understandable that page_url
is the column of the grouping function, the value we are most interested in and we want to be unique for counting. Also, the visits
column is good, as it’s the counter. But what about user_id
? What does this column represent?
We grouped on the page_url
so the value returned for user_id
is just one of the values in the group. In fact, it was not only user to visit the index.html, as users 2 and 3 visited the page. So what should we make of that value? Is it the first visitor? Is it the last one?
We don’t know the right answer! The user_id
column’s value is a random item of the group!
Solution
You need to consider if you need the values not used in the groupBy()
. If not, then just use a select()
to explicitly name the column you need.
If you do need that column not used in the groupBy()
, use an aggregate function (like ANY_VALUE()
or GROUP_CONCAT()
or MAX()
) as part of a Laravel selectRaw
query. Then you can be sure that your query is giving you what you expect.
So in the above example, you could do:
SELECT page_url, ANY_VALUE(user_id), COUNT(*) AS visits
FROM web_log
GROUP BY page_url
ORDER BY COUNT(*) DESC;
Or in Laravel:
WebLog::selectRaw('page_url', 'ANY_VALUE(user_id)', 'COUNT(*) AS visits')
->groupBy('page_url')
->orderBy('visits')
->get();