0
votes

I'm trying to pass a nested RECORD to my passthrough UDF function which performs some actions on logMessage and then returns a string. However I'm unable to find the correct leaf that contains the logMessage. I couldn't find an example that deals with multiple level nesting. Do I need to do something else with the nested record to be able to access the logMessage string 2 levels deep? I suspect the answer must be pretty straightforward, but since my query is executing, but just returning "null" for each record as a result (probably because I'm emitting a nonexistent leaf or I'm missing some logic), I don't really know how to debug this.

DATA Schema:

[{"name":"proto","mode":"repeated","type":"RECORD",
    "fields":
    [
        {"name":"line","mode":"repeated","type":"RECORD",
            "fields":
            [
                {"name": "logMessage","type": "STRING"}
            ]
        }
    ]
}]

Here's my SQL:

SELECT
url
FROM (passthrough(
SELECT
  proto.line.logMessage
FROM
  [mydata]
))

My UDF (I'm emitting the value right back at the moment - returns "null" for each record):

function passthrough(row, emit) {
  emit({url: row.proto.line.logMessage}); 
}

bigquery.defineFunction(
  'passthrough',
  ['proto.line.logMessage'],
  [{'name': 'url', 'type': 'string'}],
  passthrough
);
2
Important on SO - you can mark accepted answer by using the tick on the left of the posted answer, below the voting. See meta.stackexchange.com/questions/5234/… for why it is important! Also good practice is to vote on answer. Vote up answers that are helpful. There are more ... You can check about what to do when someone answers your question - stackoverflow.com/help/someone-answers. - Mikhail Berlyant

2 Answers

0
votes

You're using repeated records, and repeated fields are represented as arrays in JS. So you probably need something like this:

function passthrough(row, emit) {
  emit({url: row.proto[0].line[0].logMessage}); 
}

If you want to debug your UDF outside BigQuery, try using this test tool:

http://storage.googleapis.com/bigquery-udf-test-tool/testtool.html

You can generate input data for your UDF that matches the exact structure of your data by clicking on the "Preview" button in the BQ web UI and then clicking on "JSON" to get a copy-pastable JSON representation of your data.

0
votes

I think, below pretty much resembles your case:

SELECT body
FROM JS( 
  ( // input table 
  SELECT payload.comment.body
  FROM [publicdata:samples.github_nested]
  WHERE actor = 'shiftkey'
  AND payload.comment.body IS NOT NULL
  ), 
  payload.comment.body, // input columns 
  "[ // output schema 
     {'name': 'body', 'type': 'STRING'
     }
  ]", 
  "function(row, emit) { // function 
    emit({body: row.payload.comment.body}); 
  }"
)