1
votes

I have a problem with a GROUP_CONCAT select that should also include row numbering, similar to the question "GROUP_CONCAT numbering". The difference is that I have to group by multiple columns.

As an example, I have two tables, review and review_detail.
Schema (MySQL v5.5)

create table review (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `submission_id` int(11) NOT NULL,
   PRIMARY KEY (`id`)
);

create table review_detail (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `review_id` int(11),
  `category_id` int(11),
  `rating` varchar(100),
  PRIMARY KEY (`id`)
);

insert into review (`id`, `submission_id`) values (1, 1), (2, 1), (3, 2), (4, 3), (5,1), (6,3), (7,2), (8,3);

insert into review_detail (`review_id`, `category_id`, `rating`)
values 
(1, 1, ' submission 1.1 cat 1'), (1, 2, ' submission 1.1 cat 2'),
(2, 1, ' submission 1.2 cat 1'), (2, 2, ' submission 1.2 cat 2'),
(3, 1, ' submission 2.1 cat 1'), (3, 2, ' submission 2.1 cat 2'),
(4, 1, ' submission 3.1 cat 1'), (4, 2, ' submission 3.1 cat 1'),
(5, 1, ' submission 1.3 cat 1'), (5, 2, ' submission 1.3 cat 2'),
(6, 1, ' submission 3.2 cat 1'), (6, 2, ' submission 3.2 cat 2'),
(7, 1, ' submission 2.2 cat 1'), (7, 2, ' submission 2.2 cat 2'),
(8, 1, ' submission 3.3 cat 1'), (6, 2, ' submission 3.3 cat 2')
;

Query #1

SELECT * FROM review;

| id  | submission_id |
| --- | ------------- |
| 1   | 1             |
| 2   | 1             |
| 3   | 2             |
| 4   | 3             |
| 5   | 1             |
| 6   | 3             |
| 7   | 2             |
| 8   | 3             |

Query #2

SELECT * FROM review_detail;

| id  | review_id | category_id | rating                |
| --- | --------- | ----------- | --------------------- |
| 1   | 1         | 1           |  submission 1.1 cat 1 |
| 2   | 1         | 2           |  submission 1.1 cat 2 |
| 3   | 2         | 1           |  submission 1.2 cat 1 |
| 4   | 2         | 2           |  submission 1.2 cat 2 |
| 5   | 3         | 1           |  submission 2.1 cat 1 |
| 6   | 3         | 2           |  submission 2.1 cat 2 |
| 7   | 4         | 1           |  submission 3.1 cat 1 |
| 8   | 4         | 2           |  submission 3.1 cat 1 |
| 9   | 5         | 1           |  submission 1.3 cat 1 |
| 10  | 5         | 2           |  submission 1.3 cat 2 |
| 11  | 6         | 1           |  submission 3.2 cat 1 |
| 12  | 6         | 2           |  submission 3.2 cat 2 |
| 13  | 7         | 1           |  submission 2.2 cat 1 |
| 14  | 7         | 2           |  submission 2.2 cat 2 |
| 15  | 8         | 1           |  submission 3.3 cat 1 |
| 16  | 6         | 2           |  submission 3.3 cat 2 |

Every review for a submission (foreign key = submission_id) has multiple review_detail entries, each with a category_id (my example has only two categories, 1 and 2, and which ones exist is not relevant to the query).

I need a SELECT that returns a GROUP_CONCAT grouped by submission_id and category_id.

The concatenated string should look like this:
Reviewer 1: {rating}, Reviewer 2: {rating}, Reviewer 3: {rating} etc..

e.g. for submission_id = 1 and category_id = 1 the group concat should return
Reviewer 1: submission 1.1 cat 1, Reviewer 2: submission 1.2 cat 1, Reviewer 3: submission 1.3 cat 1.

But I couldn't get the numbering inside the GROUP_CONCAT right.

I have done multiple tests so far.

Group with only one column counter (works):
https://www.db-fiddle.com/f/6hA4Vft1mQGdw2Pew2An2T/3
Reviewer 1: submission 1.1 cat 1 of review 1 / Reviewer 2: submission 3.3 cat 1 of review 8 / Reviewer 3: submission 2.2 cat 1 of review 7 / Reviewer 4: submission 3.2 cat 1 of review 6 / ... etc.

SELECT
    -- review.submission_id,
    review_detail.category_id,
    @i,
    GROUP_CONCAT(
        CONCAT(
            'Reviewer ',
            @i := @i + 1,
            ': ',
            rating,
            ' of review ',  review_id
        )
    SEPARATOR ' / '
    ) concatText,
    @i := 0
FROM
    review_detail
LEFT JOIN review ON review.id = review_detail.review_id,
    (
SELECT
    @i := 0
) init
GROUP BY
    review_detail.category_id
ORDER BY
    review_detail.category_id ASC
;

Test with IF and a comparison against a string built from the two grouped columns (doesn't work):
https://www.db-fiddle.com/f/3woAVSw5hrav15jAmuWVdT/3
Reviewer 1: submission 1.1 cat 1 of review 1 / Reviewer 1: submission 1.2 cat 1 of review 2 / Reviewer 1: submission 1.3 cat 1 of review 5

SELECT
    submission_id,
    category_id,
    @i,
    @grp,
    CONCAT_WS("-", submission_id, category_id) AS catgroup,
    GROUP_CONCAT(
        CONCAT(
            'Reviewer ',
            @i := IF(
                @grp = CONCAT_WS("-", submission_id, category_id),
                @i + 1,
                IF(
                    @grp := CONCAT_WS("-", submission_id, category_id),
                    1,
                    1
                )
            ),
            ': ',
            rating,
            ' of review ',  review_id
        )
    ORDER BY review_id, submission_id, category_id 
    SEPARATOR ' / '
    ) concatText
FROM
    review_detail
LEFT JOIN review ON review.id = review_detail.review_id,
    (
SELECT
    @i := 0,
    @grp := ''
) init
GROUP BY
    review.submission_id,
    review_detail.category_id

So does anyone know a way to get the numbering inside a GROUP_CONCAT call right when grouping by multiple columns?

4
Upgrade to 8.0 or MariaDB 10.2 so you can get ROW_NUMBER(). - Rick James
Thank you all for your solutions. Every solution mentioned below works, so it's hard for me to give the bounty to a specific one. I hope it doesn't annoy you if I picked some other solution. I really appreciate all the solutions below. - BHoft

4 Answers

1
votes

You should avoid using user-defined variables like that in production code.

In the manual for MySQL 5.6 it says:

As a general rule, other than in SET statements, you should never assign a value to a user variable and read the value within the same statement.

And even in the documentation for 8.0 it states:

The order of evaluation for expressions involving user variables is undefined. For example, there is no guarantee that SELECT @a, @a:=@a+1 evaluates @a first and then performs the assignment.

In future releases this might stop working altogether:

Previous releases of MySQL made it possible to assign a value to a user variable in statements other than SET. This functionality is supported in MySQL 8.0 for backward compatibility but is subject to removal in a future release of MySQL.

So here's a solution without user-defined variables:

SELECT
    r.submission_id,
    rd.category_id,
    GROUP_CONCAT(
        CONCAT(
            'Reviewer ',
            -- reviewer number = earlier detail rows in the same submission/category, plus one
            (SELECT COUNT(*) + 1
             FROM review
             JOIN review_detail ON review.id = review_detail.review_id
             WHERE r.submission_id = review.submission_id
               AND review_detail.category_id = rd.category_id
               AND review_detail.id < rd.id),
            ': ', rating, ' of review ', review_id
        )
        ORDER BY rating
        SEPARATOR ' / '
    ) AS shorter_column_name
FROM review r
JOIN review_detail rd ON rd.review_id = r.id
GROUP BY r.submission_id, rd.category_id;

which returns

+---------------+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| submission_id | category_id | shorter_column_name                                                                                                                           |
+---------------+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
|             1 |           1 | Reviewer 1:  submission 1.1 cat 1 of review 1 / Reviewer 2:  submission 1.2 cat 1 of review 2 / Reviewer 3:  submission 1.3 cat 1 of review 5 |
|             1 |           2 | Reviewer 1:  submission 1.1 cat 2 of review 1 / Reviewer 2:  submission 1.2 cat 2 of review 2 / Reviewer 3:  submission 1.3 cat 2 of review 5 |
|             2 |           1 | Reviewer 1:  submission 2.1 cat 1 of review 3 / Reviewer 2:  submission 2.2 cat 1 of review 7                                                 |
|             2 |           2 | Reviewer 1:  submission 2.1 cat 2 of review 3 / Reviewer 2:  submission 2.2 cat 2 of review 7                                                 |
|             3 |           1 | Reviewer 1:  submission 3.1 cat 1 of review 4 / Reviewer 2:  submission 3.2 cat 1 of review 6 / Reviewer 3:  submission 3.3 cat 1 of review 8 |
|             3 |           2 | Reviewer 1:  submission 3.1 cat 1 of review 4 / Reviewer 2:  submission 3.2 cat 2 of review 6 / Reviewer 3:  submission 3.3 cat 2 of review 6 |
+---------------+-------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
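
To see where the reviewer numbers come from, it may help to run the correlated subquery on its own, without the GROUP_CONCAT. The following is only a sketch against the question's schema; the aliases r2 and rd2 and the column name reviewer_no are made up for illustration. Each row gets the count of review_detail rows with a smaller id in the same submission and category, plus one.

```sql
-- Sketch: per-row reviewer number without user-defined variables.
-- r2, rd2 and reviewer_no are illustrative names, not part of the original answer.
SELECT
    r.submission_id,
    rd.category_id,
    rd.review_id,
    rd.rating,
    (SELECT COUNT(*) + 1
     FROM review r2
     JOIN review_detail rd2 ON rd2.review_id = r2.id
     WHERE r2.submission_id = r.submission_id
       AND rd2.category_id = rd.category_id
       AND rd2.id < rd.id) AS reviewer_no
FROM review r
JOIN review_detail rd ON rd.review_id = r.id
ORDER BY r.submission_id, rd.category_id, reviewer_no;
```

Note that the number follows review_detail.id, while the GROUP_CONCAT above sorts by rating. For the sample data the two orders happen to agree; if they could differ in real data, sorting the GROUP_CONCAT by rd.id instead would keep the numbers ascending.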
1
votes

Here is how to fix your query.

The underlying problem is that tables are unsorted by nature, which is why the MySQL optimizer removes the ORDER BY.

In MySQL it is enough to put all the tables into a subquery in the FROM clause and give that subquery the ORDER BY; MySQL will keep it.

In MariaDB this is not enough. You also have to add a LIMIT 18446744073709551615 so that the optimizer keeps the ORDER BY.
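
As a rough sketch of the MariaDB variant described above, the sorted derived table might look like this. The aliases rd and r are assumptions added for readability, and the sort columns are just one plausible choice. The LIMIT value is the largest unsigned 64-bit integer, so it does not actually cut off any rows.

```sql
-- Sketch: derived table whose ORDER BY is kept because of the LIMIT.
-- rd and r are illustrative aliases; t1 matches the derived-table name used in this answer.
SELECT t1.review_id, t1.submission_id, t1.category_id, t1.rating
FROM (
    SELECT rd.review_id, r.submission_id, rd.category_id, rd.rating
    FROM review_detail rd
    LEFT JOIN review r ON r.id = rd.review_id
    ORDER BY r.submission_id, rd.category_id, rd.review_id
    LIMIT 18446744073709551615  -- effectively "no limit", but stops the ORDER BY from being dropped
) t1;
```

Even with this trick, neither server documents a guarantee that the outer query reads the rows in that order, so treat it as a workaround rather than defined behavior.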

Schema (MySQL v5.5)

Query #1

SELECT
    submission_id,
    category_id,
    @i,
    @grp,
    CONCAT_WS("-", submission_id, category_id) AS catgroup,
    GROUP_CONCAT(
        CONCAT(
            'Reviewer ',
            @i := IF(
                @grp = CONCAT_WS("-", submission_id, category_id),
                @i := @i + 1,
                IF(
                    @grp := CONCAT_WS("-", submission_id, category_id),
                    1,
                    1
                )
            ),
            ': ',
            rating,
            ' of review ',  review_id
        )
    ORDER BY review_id, submission_id, category_id 
    SEPARATOR ' / '
    ) concatText
FROM
    (SELECT review_id, submission_id, category_id,`rating` FROM review_detail
LEFT JOIN review ON review.id = review_detail.review_id
     ORDER BY review_id, submission_id, category_id ) t1,
    (
SELECT
    @i := 0,
    @grp := ''
) init


GROUP BY
    submission_id,
    category_id;

Result

| submission_id | category_id | @i  | @grp | catgroup | concatText                                                                                                                                    |
| ------------- | ----------- | --- | ---- | -------- | --------------------------------------------------------------------------------------------------------------------------------------------- |
| 1             | 1           | 0   |      | 1-1      | Reviewer 3:  submission 1.1 cat 1 of review 1 / Reviewer 2:  submission 1.2 cat 1 of review 2 / Reviewer 1:  submission 1.3 cat 1 of review 5 |
| 1             | 2           | 3   | 1-1  | 1-2      | Reviewer 3:  submission 1.1 cat 2 of review 1 / Reviewer 2:  submission 1.2 cat 2 of review 2 / Reviewer 1:  submission 1.3 cat 2 of review 5 |
| 2             | 1           | 3   | 1-2  | 2-1      | Reviewer 1:  submission 2.1 cat 1 of review 3 / Reviewer 2:  submission 2.2 cat 1 of review 7                                                 |
| 2             | 2           | 2   | 2-1  | 2-2      | Reviewer 2:  submission 2.1 cat 2 of review 3 / Reviewer 1:  submission 2.2 cat 2 of review 7                                                 |
| 3             | 1           | 2   | 2-2  | 3-1      | Reviewer 2:  submission 3.1 cat 1 of review 4 / Reviewer 1:  submission 3.2 cat 1 of review 6 / Reviewer 3:  submission 3.3 cat 1 of review 8 |
| 3             | 2           | 3   | 3-1  | 3-2      | Reviewer 3:  submission 3.1 cat 1 of review 4 / Reviewer 2:  submission 3.3 cat 2 of review 6 / Reviewer 1:  submission 3.2 cat 2 of review 6 |

View on DB Fiddle

1
votes

You need a two-step subquery to sort by reviewer number.

SET @i := 0;
SET @grp := '';
SELECT
    submission_id,
    category_id,
    GROUP_CONCAT(
      CONCAT(
        'Reviewer ',
        i,
        ': ',
        rating,
        ' of review ',  review_id
      )
      ORDER BY i
      SEPARATOR ' / '
    ) concatText
FROM
-- second, add numbering
(
  SELECT *,
    @i := IF(
      @grp = @grp := CONCAT_WS('-',submission_id,category_id),
      @i + 1, 1) i
  FROM
  -- first, sort for numbering
  (
    SELECT
        review_id,
        submission_id,
        category_id,
        rating
    FROM review_detail LEFT JOIN review ON review.id = review_detail.review_id
    ORDER BY
        submission_id,
        category_id,
        review_id
  ) t1
) t2
GROUP BY
    submission_id,
    category_id
;

db fiddle

0
votes

For completeness, here is how this can be done in MySQL 8.0.

It works with COUNT(*) as a window function:

with base as (
    
  SELECT
    review_id,
    submission_id,
    category_id,
    rating,
    count(*) over (partition by submission_id,category_id  order by review_id) num
  
    FROM review_detail LEFT JOIN review ON review.id = review_detail.review_id
    ORDER BY
        submission_id,
        category_id,
        review_id
)
select   
  submission_id,
         category_id,
         group_concat(concat('Reviewer ', num, ': ', rating, ' of review ',  review_id ) order by num separator ', ') concattext
from     base
group by 
submission_id,
category_id
;

or with ROW_NUMBER():

with base as (
        SELECT
            review_id,
            submission_id,
            category_id,
            rating,
            ROW_NUMBER() over (partition by submission_id,category_id  order by review_id) num
        FROM review_detail 
        LEFT JOIN review ON review.id = review_detail.review_id
        ORDER BY
            submission_id,
            category_id,
            review_id
    )
    SELECT   
        submission_id,
        category_id,
        group_concat(concat('Reviewer ', num, ': ', rating, ' of review ',  review_id  ) order by num separator ', ') concattext
    from base
    group by 
        submission_id,
        category_id
;

DB Fiddle
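
The two window functions are not fully interchangeable when the ORDER BY column has ties. With an ORDER BY and no explicit frame, COUNT(*) counts up to and including all peers of the current row, so rows with the same review_id get the same number, while ROW_NUMBER() always assigns distinct numbers. A small sketch to compare them side by side (the window name w, the aliases rd and r, and the column names num_count and num_row are made up for illustration):

```sql
-- Sketch: compare COUNT(*) and ROW_NUMBER() over the same window (MySQL 8.0+).
-- w, rd, r, num_count and num_row are illustrative names.
SELECT
    r.submission_id,
    rd.category_id,
    rd.review_id,
    COUNT(*)     OVER w AS num_count,  -- ties on review_id share one number
    ROW_NUMBER() OVER w AS num_row     -- always 1, 2, 3, ... within the partition
FROM review_detail rd
LEFT JOIN review r ON r.id = rd.review_id
WINDOW w AS (PARTITION BY r.submission_id, rd.category_id ORDER BY rd.review_id)
ORDER BY r.submission_id, rd.category_id, rd.review_id;
```

In the sample data, submission 3 / category 2 contains review_id 4, 6 and 6 (the last review_detail row was inserted with review_id 6), so num_count should come out as 1, 3, 3 while num_row gives 1, 2, 3. If duplicate review_id values are possible within a group, ROW_NUMBER() is probably the safer choice.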