0
votes

I am trying to do a group by on this table so that it will look like the following below.

Before

 Student  Fruit 1  Fruit 2
 Tom      Apple    Banana
 Tom      Banana   Apple
 Gary     Apple    Banana

After

 Student  Fruit 1  Fruit 2  Count  Repeated Condition 
 Tom      Apple    Banana   2      True
 Gary     Apple    Banana   1      False

Anyone knows how can this be easily achieved on Google BigQuery? Order does not matter. I have tried String_AGG and Concat functions but the order matters.

Trying to create a function that attempts to create the second table.

2
How is Repeated Condition generated? - Nick
if count >1 then true else false - Jervis Chionh
And what does Count count? - Nick
the rows of tom in which he had consumed apple and bananas with a different order - Jervis Chionh

2 Answers

2
votes

You can use LEAST and GREATEST to sort the fruits in the two columns into alphabetical order, and then group on those sorted values:

SELECT Student,
       LEAST(Fruit1, Fruit2) AS Fruit1,
       GREATEST(Fruit1, Fruit2) AS Fruit2,
       COUNT(*) AS Count,
       CASE WHEN COUNT(*) > 1 THEN 'True' ELSE 'False' END AS "Repeated Condition"
FROM fruits
GROUP BY Student, LEAST(Fruit1, Fruit2), GREATEST(Fruit1, Fruit2)

Output:

student     fruit1  fruit2  count   Repeated Condition
Tom         Apple   Banana  2       True
Gary        Apple   Banana  1       False
1
votes

Here is one option, which uses a least/greatest trick to get the reporting you want:

SELECT
    Student,
    LEAST(fruit1, fruit2) AS fruit1,
    GREATEST(fruit1, fruit2) AS fruit2,
    COUNT(*) AS cnt,
    CASE WHEN LEAST(fruit1) <> GREATEST(fruit1)
         THEN 'True' ELSE 'False' END AS "Repeated Condition"
FROM yourTable
GROUP BY
    Student,
    LEAST(fruit1, fruit2),
    GREATEST(fruit1, fruit2)

The idea here is to aggregate by student along with the smaller/larger values of any pair of two fruits. This brings (Apple, Banana) into line with (Banana, Apple). Then, we report True for the repeated condition if, in a given student/fruit group, we can detect that the order of fruits was ever not the same everywhere. Otherwise, we report False.