1
votes

I'm getting an "Error: Unexpected. Please try again" message in BigQuery when I try to run a query. It's happening repeatedly.

I have two nested subqueries and the middle query uses several count(distinct x, ), so I wonder if that's the problem. If I remove the outer query (but still have all the count distincts, it does work fine though.

2
In case someone from Google sees this, my job id is realself-main:job_4CBlFODR8i4D7UAOzF1F1OPqP8I.alan

2 Answers

4
votes

This error arises from a quirk in the implementation of COUNT(DISTINCT) that will hopefully be improved soon. For now, if you use multiple COUNT(DISTINCT) together with GROUP BY in a sub-select, you need to use all or none of the the COUNT(DISTINCT) output in the outer SELECT.

If this is troublesome, it may be easier to replace the COUNT(DISTINCT) with GROUP BY and SUM(). For example, this query similar to your own will fail if either of the outer fields are removed:

SELECT Cnt_a, Cnt_b
FROM (SELECT COUNT(DISTINCT IF(a > 0, c, NULL)) AS Cnt_a,
             COUNT(DISTINCT IF(b > 0, c, NULL)) AS Cnt_b
      FROM (SELECT 0 AS a, 1 AS b, 2 AS c, 3 as d)
      GROUP BY d);

But an equivalent query that groups by the distinct field will execute regardless:

SELECT SUM(NonZero_a) AS Cnt_a, SUM(NonZero_b) AS Cnt_b
FROM (SELECT SUM(a) > 0 AS NonZero_a,
             SUM(b) > 0 AS NonZero_b
      FROM (SELECT 0 AS a, 1 AS b, 2 AS c, 3 as d)
      GROUP BY c, d);
0
votes

I looked up the internal error, and it seems to be an error we should have made external:

"Scoped Aggregation must consume the most repeated field from every independently-repeating branch. Unused fields that must be consumed: f6_, f5_, f4_, f3_, f2_, f1_"

Let me know if that is not enough to debug the query and I can add more explanation about the error.