2
votes

I'm using big query and i want to create a job which populates a table with a "record" type columns. The data will be populated by a query - Is it possible to create a table with two columns type record?

like the table [bigquery-public-data:samples.trigrams] in BG Public Datasets

Thanks!

2

2 Answers

2
votes

The easiest way to control the output of a query as a record, is to use a JavaScript UDF.

For example:

SELECT *
FROM js(
(
  SELECT item
  FROM [fh-bigquery:wikidata.latest_raw] 
),
item,
"[{name: 'id', type:'string'},
  {name: 'sitelinks', type:'record', mode:'repeated', fields: [{name: 'site', type: 'string'},{name: 'title', type: 'string'},{name: 'encoded', type: 'string'}]},
  ]",
  "function(r, emit) {
    [...]
emit({
    id: obj.id,
    sitelinks: sitelinks,
    });  
  }")

See the complete example at https://github.com/fhoffa/code_snippets/blob/master/wikidata/create_wiki_en_table.sql.

0
votes

With introduction of BigQuery Standard SQL we've got easy way to deal with records
Try below, Don't forget to uncheck Use Legacy SQL checkbox under Show Options

WITH YourTable AS (
  SELECT 1 AS a, 2 AS b, 3 AS c, 11 AS x, 12 AS y, 13 AS z UNION ALL
  SELECT 1 AS a, 2 AS b, 3 AS c, 11 AS x, 12 AS y, 13 AS z UNION ALL
  SELECT 2 AS a, 2 AS b, 3 AS c, 11 AS x, 12 AS y, 13 AS z UNION ALL
  SELECT 2 AS a, 2 AS b, 3 AS c, 12 AS x, 12 AS y, 13 AS z UNION ALL
  SELECT 3 AS a, 2 AS b, 3 AS c, 12 AS x, 12 AS y, 13 AS z UNION ALL
  SELECT 3 AS a, 2 AS b, 3 AS c, 12 AS x, 12 AS y, 13 AS z UNION ALL
  SELECT 3 AS a, 2 AS b, 3 AS c, 13 AS x, 12 AS y, 13 AS z
)
SELECT 
  a, ARRAY_AGG(STRUCT(b, c)) AS aa, 
  x, ARRAY_AGG(STRUCT(y, z)) AS xx
FROM YourTable
GROUP BY a, x

Similar result in BigQuery Legacy SQL can be done by below code:

SELECT *
FROM JS( 
  ( // input table 
  SELECT 
    a, GROUP_CONCAT(CONCAT(STRING(b), ';', STRING(c))) AS aa, 
    x, GROUP_CONCAT(CONCAT(STRING(y), ';', STRING(z))) AS xx
  FROM 
    (SELECT 1 AS a, 2 AS b, 3 AS c, 11 AS x, 12 AS y, 13 AS z),
    (SELECT 1 AS a, 2 AS b, 3 AS c, 11 AS x, 12 AS y, 13 AS z),
    (SELECT 2 AS a, 2 AS b, 3 AS c, 11 AS x, 12 AS y, 13 AS z),
    (SELECT 2 AS a, 2 AS b, 3 AS c, 12 AS x, 12 AS y, 13 AS z),
    (SELECT 3 AS a, 2 AS b, 3 AS c, 12 AS x, 12 AS y, 13 AS z),
    (SELECT 3 AS a, 2 AS b, 3 AS c, 12 AS x, 12 AS y, 13 AS z),
    (SELECT 3 AS a, 2 AS b, 3 AS c, 13 AS x, 12 AS y, 13 AS z)
  GROUP BY a,x
  ), 
  a, aa, x, xx, // input columns 
  "[ // output schema 
  {name: 'a', type:'integer'},
  {name: 'aa', type:'record', mode:'repeated', 
  fields: [
    {name: 'b', type: 'integer'},
    {name: 'c', type: 'integer'}
    ]},
  {name: 'x', type:'integer'},
  {name: 'xx', type:'record', mode:'repeated', 
  fields: [
    {name: 'y', type: 'integer'},
    {name: 'z', type: 'integer'}
    ]}
   ]", 
  "function(row, emit) { // function 
    var aa = []; 
    aa1 = row.aa.split(',');
    for (var i = 0; i < aa1.length; i++) { 
      aa2 = aa1[i].split(';');
      aa.push({b:parseInt(aa2[0]), c:parseInt(aa2[1])}); 
    }; 
    var xx = []; 
    xx1 = row.xx.split(',');
    for (var i = 0; i < aa1.length; i++) { 
      xx2 = xx1[i].split(';');
      xx.push({y:parseInt(xx2[0]), z:parseInt(xx2[1])}); 
    };
    emit({
      a: row.a, 
      aa: aa, 
      x: row.x,
      xx: xx
      }); 
  }"
)  

For this to work (for Legacy SQL) you need to set destination table and check Allow Large Results checkbox and unckeck Flatten Results checkbox (all under Show Options)