14
votes

I have this query which works correctly in MySQL. More background on it here.

SELECT c.*, SUM(ABS(v.vote)) AS score
FROM categories c,items i, votes v
    WHERE c.id = i.category_id
    AND i.id = v.voteable_id
    AND v.created_at > '#{1.week.ago}'
GROUP BY c.id
ORDER BY score DESC LIMIT 8;

I tried running it in PostgreSQL, and it failed with this error message.

PGError: ERROR: column "c.name" must appear in the GROUP BY clause or be used in an aggregate function

I wasn't sure what this meant, so I tried changing "c.id" to "c.name" in the group by clause (both work in MySQL the same, assuming the name of an item is unique).

However this just produced another similar error

PGError: ERROR: column "c.id" must appear in the GROUP BY clause or be used in an aggregate function

How can this problem be resolved?

3
Oddly, it seems that this is now allowed in Postgre 9.1you786
Was it removed in Postgre 9.2? I'm getting this error for the same reasons.Aaron
@Aaron: it was not removed but works only when (quoting the documentation): the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column. A functional dependency exists if the grouped columns (or a subset thereof) are the primary key of the table containing the ungrouped columnDaniel Vérité

3 Answers

11
votes

You have to list column names in SELECT which you are grouping in:

SELECT c.id, c.name, SUM(ABS(v.vote)) AS score
FROM categories c,items i, votes v
  WHERE c.id = i.category_id
  AND i.id = v.voteable_id
  AND v.created_at > '#{1.week.ago}'
GROUP BY c.id, c.name
ORDER BY score DESC LIMIT 8;

"It is not permissible to include column names in a SELECT clause that are not referenced in the GROUP BY clause."

9
votes

I just had that issue but going from MySQL to SQL Server. I thought the fact that it is allowed it was strange!

Yes, in most databases, when you have a GROUP BY clause you can only select aggregates of columns or columns that appear in the GROUP BY clause. This is because it has no way of knowing if the other columns you're selecting are truly unique or not.

Just put the columns you want in the GROUP BY if they are indeed unique. This was a "feature" of MySQL that was questionable.

You can read about MySQL's behavior and how it is different here.

Example:

SELECT c.*, SUM(ABS(v.vote)) AS score
FROM categories c,items i, votes v
    WHERE c.id = i.category_id
    AND i.id = v.voteable_id
    AND v.created_at > '#{1.week.ago}'
GROUP BY c.id, c.name, c.whatever_else
ORDER BY score DESC LIMIT 8;
2
votes

If you change your statement to this it should work:

SELECT c.id, SUM(ABS(v.vote)) AS score
FROM categories c,items i, votes v
  WHERE c.id = i.category_id
  AND i.id = v.voteable_id
  AND v.created_at > '#{1.week.ago}'
GROUP BY c.id
ORDER BY score DESC LIMIT 8;

I'm not sure what MySQL gives as a result, but to give you a very small example of why this does not work in PostgreSQL, take a look at the following categories table:

id | name
---|-----
 1 | ABC
 1 | DEF

You group by id, so each row in the result should contain just one id. If you select name also, without grouping by it, what should be shown in the result for name?

It can be either ABC or DEF but the database engine can not really decide that for you (although MySQL apparently does).