The following query:
select
corpus_date
,sum(count(*)) over (partition by corpus_date) as words_year
,corpus
,count(*) as words
from [publicdata:samples.shakespeare]
group by
corpus_date
,corpus
order by
corpus_date
results in the error message SELECT * cannot be combined with selecting other fields or expressions.
Now I'm wondering whether I'm missing an obvious mistake or whether BigQuery really doesn't allow combining windowed and aggregate functions. I haven't been able to find the answer in the documention. The error message does not yield any search results and in any case seems inappropriate.
(I've used the Shakespeare database here for replication purposes only.)
EDIT: The desired result can obviously be obtained with a subselect, like so:
select
corpus_date
,sum(words) over (partition by corpus_date) as words_year
,corpus
,words
from (select
corpus_date
,corpus
,count(*) as words
from [publicdata:samples.shakespeare]
group by
corpus_date
,corpus
) sub
order by
corpus_date
So I should clarify that my question is simply whether anyone can confirm this unexpected behaviour.
EDIT2: Apparently it's also not possible to directly (i.e. without appealing to another subselect) use the result of a windowed function in a calculation :-( :
select
corpus
,corpus_date
,words
,1.*words/sum(words) over (partition by corpus_date) as perc_of_year
from (select
corpus_date
,corpus
,count(*) as words
from [publicdata:samples.shakespeare]
group by
corpus_date
,corpus
) sub
order by
corpus_date
,corpus