4
votes

I used BigQuery about analytics. I written this Query.

[setup_chr_lvs]'s schema is

userId is STRING,

lv is STRIMG,

chrLvs is STRING

"[1, 25, 344]" (for example)

SELECT userId,
   lv,
   INTEGER(SPLIT( REGEXP_REPLACE( REGEXP_REPLACE(chrLvs, r'\[', ''), r'\]', ''), ",")) AS chrLv
   FROM [kpi.setup_chr_lvs]

This schema is successful. But can't save view.

Failed to create view. Cannot create valid output schema for field chrLV. Try renaming chrLv to f0_group.chrLv in the outermost SELECT.

Why?

2

2 Answers

2
votes

First, the error message's crazy-looking suggestion can actually work -- instead of your AS chrLv, have AS f0_group.chrLv. I won't even try to explain exactly why, because I can't really, but I can give some background on when this thing happens and how to work around it generally.

The key thing here is the SPLIT. SPLIT creates repeated values, which are where a single row and single column contains multiple values. When you SPLIT "a,b,c" as val, the immediate result is a single row whose schema would say val: string (repeated).

When you run a query using a SPLIT, by default, the query results are flattened: a single row containing a 3-repeated value will turn into 3 rows. When you try to Save View, the behavior is different: it doesn't flatten. That's why you can query but not save the view. (If you want, you can reproduce the unflattened behavior by setting query options: uncheck "Flatten Results" (also "Allow Large" and "Select Table").)

So why the funny error with the unflattened result? This happens when you apply a function to a repeated value. The SPLIT creates a repeated output, and the INTEGER applies to that, triggering a surprise in the schema of the expression.

The workaround is to remove repeatedness -- flatten it to multiple rows -- before applying that function. First SPLIT, then FLATTEN, then the INTEGER. Like this:

SELECT
  userId,
  lv,
  INTEGER(chrLvStr) as chrLv
FROM FLATTEN(
  (
  SELECT 
    userId,
    lv,
    SPLIT( REGEXP_REPLACE( REGEXP_REPLACE(chrLvs, r'\[', ''), r'\]', ''), ",") AS chrLvStr
  FROM [test.setup_chr_lvs]
  ), 
  chrLvStr
)

This runs (and can be saved as a view) where [test.setup_chr_lvs] is a one-row model of your data saved from this query:

select "user1" as userId, "lv1" as lv, "[1, 24, 344]" as chrLvs

(And no, I do not have an indentation convention I use for FLATTEN.)

0
votes

Not knowing the internals of BigQuery, I would suppose this is because at some level, your chrLvs are broken down into partial attributes called "chrLv" and that confuses the system.

Did you try renaming your "as" to what was suggested?