1
votes

I need to query for the same aggregate stat over all tables matching some naming scheme:

SELECT
  SUBSTR(CONCAT('20', _TABLE_SUFFIX), 0, 10) AS date,
  CASE
    WHEN ENDS_WITH(_TABLE_SUFFIX, 'mobile') THEN 'mobile'
    ELSE 'desktop'
  END AS client,
  APPROX_QUANTILES(bytesJS,1000)[OFFSET(500)] AS p50
FROM
  `httparchive.runs.20*`
WHERE
  ENDS_WITH(_TABLE_SUFFIX, '_pages')
  OR ENDS_WITH(_TABLE_SUFFIX, '_pages_mobile')
GROUP BY
  1,
  2
ORDER BY
  1,
  2

However, the query is failing with Error: Unrecognized name: bytesJS at [7:20]

There should be about 150 tables that match the query. Here's the schema for one of these tables: https://bigquery.cloud.google.com/table/httparchive:runs.2017_05_15_pages

bytesJS is a valid column name, but it's failing as unrecognized. Same error when substituting different known columns. I'm not sure if this is a direct result of using the table wildcard or what.

I would rewrite this with legacy SQL and see if that works, but my query requires selecting the table metadata.

Is this an incompatibility with the wildcard?

1
Also note that removing the APPROX_QUANTILES line produces valid results, so the error is definitely to do with bytesJS.Rick Viscomi

1 Answers

2
votes

This is a version that works:

#standardSQL
SELECT
  SUBSTR(CONCAT('20', _TABLE_SUFFIX), 0, 10) AS date,
  CASE
    WHEN ENDS_WITH(_TABLE_SUFFIX, 'mobile') THEN 'mobile'
    ELSE 'desktop'
  END AS client,
  APPROX_QUANTILES(bytesJS,1000)[OFFSET(500)] AS p50
FROM
  `httparchive.runs.2017_05_15_pages*`
#WHERE
  #ENDS_WITH(_TABLE_SUFFIX, '_pages')
  #OR ENDS_WITH(_TABLE_SUFFIX, '_pages_mobile')
GROUP BY
  1,
  2
ORDER BY
  1,
  2

What's happening:

  • httparchive.runs.20* encompasses some tables which don't have the bytesJS column, hence that column is not considered for the rest of the query.
  • By the time the query looks for (_TABLE_SUFFIX, '_pages') OR (_TABLE_SUFFIX, '_pages_mobile'), it's too late to change the query assumed schema.

Solution:

  • FROM prefix* table glob won't work here, you'll have to run a classic UNION of tables.