4
votes

Since two days ago (August 10th 2016), a query which used to work (using tables of the BQ Export for Google Analytics Premium) has stopped working. It returns the following error:

Error: Cannot union tables : Incompatible types. 'hits.latencyTracking.userTimingVariable' : TYPE_INT64 'hits.latencyTracking.userTimingVariable' : TYPE_STRING

After some investigation, it seems to be a problem with using IN in a WHERE clause when I query tables from before and after August 10th (table ga_sessions_20160810).

I've simplified my original query to provide a dummy one which has the same basic structure. The following query works (querying data from 2016-08-08 and 2016-08-09):

SELECT fullVisitorId, sum(totals.visits)
FROM (select * from TABLE_DATE_RANGE([XXXXXXXX.ga_sessions_],TIMESTAMP('2016-08-08'),TIMESTAMP('2016-08-09')))
WHERE fullVisitorId in(
    SELECT fullVisitorId
    FROM TABLE_DATE_RANGE([XXXXXXXX.ga_sessions_],TIMESTAMP('2016-08-08'),TIMESTAMP('2016-08-09'))
)
GROUP BY fullVisitorId

But this other one (just changing dates, in this case from 2016-08-09 and 2016-08-10) returns the error:

SELECT fullVisitorId, sum(totals.visits)
FROM (select * from TABLE_DATE_RANGE([XXXXXXXX.ga_sessions_],TIMESTAMP('2016-08-09'),TIMESTAMP('2016-08-10')))
WHERE fullVisitorId in(
    SELECT fullVisitorId
    FROM TABLE_DATE_RANGE([XXXXXXXX.ga_sessions_],TIMESTAMP('2016-08-09'),TIMESTAMP('2016-08-10'))
)
GROUP BY fullVisitorId

This last query works fine either if I delete the WHERE clause or if I just try the query within the IN, so I guess the problem is with the structure WHERE field IN(...). Furthermore, querying only data from 2016-08-10 does work. Also, the same happens using a field different to fullVisitorId and running the same queries in different BQ projects.

Looking to the error description, it should be a problem with variable types, but I don't know what is hits.latencyTracking.userTimingVariable. My query used to work properly, so I can't figure out what has changed that produces the error. Have some fields changed their type or what happened?

Has anyone experienced this? Is this a bug or a new behavior in BigQuery? How can this error be solved?

2
The schema changes from one day to another, if you check the table ga_sessions the parameter hits.latencyTracking.userTimingVariable is a integer and now it's a string, you cant even try to do this: SELECT fullVisitorId , hits.latencyTracking.userTimingVariable FROM TABLE_DATE_RANGE([xxxxxxxxxxxx.ga_sessions_],TIMESTAMP('2016-08-9'),TIMESTAMP('2016-08-10')) , the date_range function is not prepared to a schema change beetween daysKemen Paulos Plaza
You're right, Paulo. hits.latencyTracking.userTimingVariable has changed from INTEGER to STRING during this period, but I'm not even querying this field... Do you know in what way my query is using it?Pol Ferrando

2 Answers

2
votes

As you are using * in select clause it might causing problem when union is happening its trying to combine two different column types ( as schema changed from INT64 to STRING).

I have two approaches

1) use only those fields required by you than using * in select clause

SELECT fullVisitorId, sum(totals.visits)
 FROM (select fullVisitorId,totals.visits from TABLE_DATE_RANGE([XXXXXXXX.ga_sessions_],TIMESTAMP('2016-08-09'),TIMESTAMP('2016-08-10')))
  WHERE fullVisitorId in(
   SELECT fullVisitorId
    FROM TABLE_DATE_RANGE([XXXXXXXX.ga_sessions_],TIMESTAMP('2016-08-09'),TIMESTAMP('2016-08-10'))
  ) GROUP BY fullVisitorId

2) using views to split inner queries and use the view later in the query. (even in view you need to use only use those fields which are required )

SELECT fullVisitorId, sum(totals.visits)
 FROM [view.innertable2]
  WHERE fullVisitorId in(
    SELECT fullVisitorId from [view.innertable1] ) GROUP BY fullVisitorId

This will exclude the hits.latencyTracking.userTimingVariable so there will be no error.

0
votes

If the fields that you are querying are compatible, you may try using Standard SQL wildcard tables (you'll have to uncheck use Legacy SQL box if you are doing this from the UI). Something like this:

SELECT fullVisitorId, sum(totals.visits) 
FROM `xxxxxxxx.ga_sessions_*`
WHERE _TABLE_SUFFIX BETWEEN '20160808' and '20160810'
GROUP BY fullVisitorId;