I am looking for a way to aggregate a BigQuery table with about 100 columns using the SUM() function, but the following query is not allowed in BigQuery Standard SQL:
select sum(*)
from `dataset.Intermediate_Tables.eventCat_dummies`
group by Id
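Since there is no sum(*) in SQL, one workaround is to generate the query string with one SUM() per column. A minimal sketch in Python, assuming the column names are known up front (in practice they could be read from the table schema, e.g. via INFORMATION_SCHEMA or the BigQuery client library):

```python
# Build a per-column SUM query for BigQuery, since sum(*) is not valid SQL.
# The column list below is a stand-in for the real ~100 columns.
columns = ["col1", "col2", "col3", "col4"]

select_list = ",\n  ".join(f"SUM({c}) AS {c}_sum" for c in columns)
query = (
    "SELECT Id,\n  "
    + select_list
    + "\nFROM `dataset.Intermediate_Tables.eventCat_dummies`"
    + "\nGROUP BY Id"
)
print(query)
```

The generated string can then be pasted into the BigQuery console or passed to the client library's query method.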
The reason I want to do this aggregation in BigQuery is its ability to handle large amounts of data. I tried the same aggregation in a Jupyter notebook, but it failed every time, probably because of the size of the data (a 7.3 GiB CSV file). The code I tried:
df_type = type_dummies.groupby('Id', sort=False).sum()
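If the aggregation has to stay in pandas, one way to avoid loading the whole 7.3 GiB file at once is to read it in chunks, sum each chunk per Id, and then sum the partial results. A sketch (the inline CSV stands in for the real file, whose name is not given in the question):

```python
import pandas as pd
from io import StringIO

# Inline sample standing in for the large CSV file.
csv_data = StringIO(
    "Id,col1,col2,col3,col4\n"
    "1,0,0,0,1\n"
    "2,0,1,1,1\n"
    "1,1,0,0,0\n"
    "4,0,0,0,0\n"
    "19,0,0,0,0\n"
    "2,1,1,1,1\n"
)

# Read in chunks so only a slice of the file is in memory at a time;
# sum each chunk per Id, then sum the partial group results.
partials = [
    chunk.groupby("Id").sum()
    for chunk in pd.read_csv(csv_data, chunksize=2)
]
result = pd.concat(partials).groupby(level=0).sum()
print(result)
```

This works because SUM is associative: summing per-chunk subtotals gives the same result as summing the whole column at once. With a real file, `chunksize` would be set much larger (e.g. a few hundred thousand rows).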
Can anyone suggest how I can get the aggregated data from this large dataset, in BigQuery or otherwise?
UPDATE WITH SAMPLE INPUT AND OUTPUT
Input data
Id col1 col2 col3 col4
1 0 0 0 1
2 0 1 1 1
1 1 0 0 0
4 0 0 0 0
19 0 0 0 0
2 1 1 1 1
Desired output
Id col1_sum col2_sum col3_sum col4_sum
1 1 0 0 1
2 1 2 2 2
4 0 0 0 0
19 0 0 0 0
In my original dataset, there are 100 columns and 40 million rows.