I am using the Dataflow template "Pub/Sub Topic to BigQuery" to load JSON messages into a BigQuery table whose schema contains a RECORD-type field. Sample message:
{
  "url":"/i?session_duration=61&app_key=123456&device_id=gdfttyty&sdk_name=javascript_native_web&sdk_version=18.04",
  "body":
  {
    "session_duration":"61",
    "app_key":"eyrttyuyyu78jkjk",
    "device_id":"h1bh41yptik1vtwr8",
    "sdk_name":"javascript_native_web",
    "sdk_version":"18.04",
    "timestamp":"1597057884636",
    "hour":"10",
    "dow":"1"
  },
  "app_key":"eyrttyuyyu78jkjk",
  "timestamp":"1597057884636",
  "ip_address":"0.0.0.0"
}
The schema defined in BigQuery is:
[
  {
    "name":"url",
    "type":"STRING",
    "mode":"NULLABLE"
  },
  {
    "name":"body",
    "type":"RECORD",
    "mode":"REPEATED",
    "fields":[
      {
        "name":"session_duration",
        "type":"STRING",
        "mode":"NULLABLE"
      },
      {
        "name":"app_key",
        "type":"STRING",
        "mode":"NULLABLE"
      },
      {
        "name":"device_id",
        "type":"STRING",
        "mode":"NULLABLE"
      },
      {
        "name":"sdk_name",
        "type":"STRING",
        "mode":"NULLABLE"
      },
      {
        "name":"sdk_version",
        "type":"STRING",
        "mode":"NULLABLE"
      },
      {
        "name":"timestamp",
        "type":"TIMESTAMP",
        "mode":"NULLABLE"
      },
      {
        "name":"hour",
        "type":"TIME",
        "mode":"NULLABLE"
      },
      {
        "name":"dow",
        "type":"STRING",
        "mode":"NULLABLE"
      }
    ]
  },
  {
    "name":"app_key",
    "type":"STRING",
    "mode":"NULLABLE"
  },
  {
    "name":"timestamp",
    "type":"STRING",
    "mode":"NULLABLE"
  },
  {
    "name":"ip_address",
    "type":"STRING",
    "mode":"NULLABLE"
  }
]
Error Message:
{"errors":[{"debugInfo":"","location":"","message":"Repeated record added outside of an array.","reason":"invalid"}],"index":0}
If I load the data without the RECORD-type field, it is parsed correctly and lands in the intended BigQuery table, but with the RECORD-type field the rows are written to the BigQuery-generated <error_records> table instead.
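
To make the mismatch that the error message seems to describe concrete, here is a minimal local sketch in Python. It does not call Dataflow or BigQuery; it only compares the shape of a message with the `mode` of each schema field. The function name `find_mode_mismatches` and the trimmed schema and message are hypothetical, made up for illustration. In the sample above, `body` is a single JSON object while the schema declares it as `REPEATED`, which is the kind of disagreement this check reports.

```python
import json

# Trimmed, illustrative copy of the schema: just enough fields to show the check.
SCHEMA = [
    {"name": "url", "type": "STRING", "mode": "NULLABLE"},
    {"name": "body", "type": "RECORD", "mode": "REPEATED", "fields": [
        {"name": "session_duration", "type": "STRING", "mode": "NULLABLE"},
    ]},
]


def find_mode_mismatches(row, schema, path=""):
    """Return a list of field paths whose value shape disagrees with the field mode.

    Hypothetical helper: a REPEATED field is expected to hold a JSON array,
    any other mode is expected to hold a single value.
    """
    problems = []
    for field in schema:
        name = field["name"]
        if name not in row or row[name] is None:
            continue  # missing or null values are not checked in this sketch
        value = row[name]
        full_name = path + name
        repeated = field.get("mode") == "REPEATED"
        if repeated and not isinstance(value, list):
            problems.append(full_name + ": REPEATED field but value is not an array")
            continue
        if not repeated and isinstance(value, list):
            problems.append(full_name + ": non-REPEATED field but value is an array")
            continue
        if field["type"] == "RECORD":
            # Recurse into each nested record (one record, or every array element).
            items = value if repeated else [value]
            for item in items:
                if isinstance(item, dict):
                    problems.extend(
                        find_mode_mismatches(item, field["fields"], full_name + ".")
                    )
    return problems


# Same shape as the sample message: "body" is a single object.
message = json.loads('{"url": "/i", "body": {"session_duration": "61"}}')
print(find_mode_mismatches(message, SCHEMA))

# Wrapping "body" in an array makes the shape agree with mode REPEATED.
wrapped = dict(message, body=[message["body"]])
print(find_mode_mismatches(wrapped, SCHEMA))
```

If this reading is right, there are two ways to make the two sides agree: send `body` as an array of objects, or declare `body` with mode `NULLABLE` so that a single object is accepted. I have not verified either against the template, and the `TIMESTAMP` and `TIME` fields inside `body` may still need their own checks.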