1
votes

I have this data in table

column 1    column 2    condition   value
1   ABC 1   1
1   строка 2    0   1
1   строка 3    0   1
1   строка 4    0   1
1   строка 5    0   1
1   строка 6    0   1
2   BCD 1   1
2   строка 2    0   1
2   строка 3    0   1

And I wont to have this Result:

column 1    column 2    sum of value
1   ABC 6
2   BCD 3

This result is grouped by the first column only. The second column shows the values that are on those lines where the condition is 1. In the third column, we get the sum over the column value, not including any conditions. Is it possible to do this without join (because table very large)

1

1 Answers

4
votes

Below is for BigQuery Standard SQL

#standardSQL
SELECT 
  column1, 
  MAX(IF(condition = 1, column2, '')) column2,
  SUM(value)sumOfValue
FROM `project.dataset.table`
GROUP BY column1  

You can test / play with it using example from your question as below

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 column1, 'ABC' column2,  1 condition,  1 value UNION ALL
  SELECT 1,         'строка 2',     0,            1 UNION ALL
  SELECT 1,         'строка 3',     0,            1 UNION ALL
  SELECT 1,         'строка 4',     0,            1 UNION ALL
  SELECT 1,         'строка 5',     0,            1 UNION ALL
  SELECT 1,         'строка 6',     0,            1 UNION ALL
  SELECT 2,         'BCD',          1,            1 UNION ALL
  SELECT 2,         'строка 2',     0,            1 UNION ALL
  SELECT 2,         'строка 3',     0,            1 
)
SELECT 
  column1, 
  MAX(IF(condition = 1, column2, '')) column2,
  SUM(value)sumOfValue
FROM `project.dataset.table`
GROUP BY column1   

The result is as expected (I think)

column1 column2 sumOfValue   
1       ABC     6    
2       BCD     3      

Above solution assumes you have just one row with condition=1 per group of column1. If this is not a case - above will give yo just one (MAX) value from column2. If you need to group all such values - see option below

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 1 column1, 'ABC' column2,  1 condition,  1 value UNION ALL
  SELECT 1,         'строка 2',     0,            1 UNION ALL
  SELECT 1,         'строка 3',     1,            1 UNION ALL
  SELECT 1,         'строка 4',     0,            1 UNION ALL
  SELECT 1,         'строка 5',     1,            1 UNION ALL
  SELECT 1,         'строка 6',     0,            1 UNION ALL
  SELECT 2,         'BCD',          1,            1 UNION ALL
  SELECT 2,         'строка 2',     0,            1 UNION ALL
  SELECT 2,         'строка 3',     0,            1 
)
SELECT 
  column1, 
  REGEXP_REPLACE(STRING_AGG(IF(condition = 1, CONCAT(column2, ','), ''),''), r',$', '') column2, 
  SUM(value)sumOfValue
FROM `project.dataset.table`
GROUP BY column1  

In this case, result will be

column1 column2                 sumOfValue   
1       ABC,строка 3,строка 5   6    
2       BCD                     3