I am trying to insert the JSON data below into a BigQuery table using the NiFi PutBigQueryBatch processor (NiFi 1.9.2):
{
  "Revenue_Label": "Dining Room",
  "StoreName": "STORE A",
  "Revenue_Id": "1",
  "Alteration_Flag": "False",
  "Order_Mode_Label": "DriveThru",
  "checkClosetime": "2019-10-24T13:43:19+13:00",
  "Alterations": [
    {
      "Alteration_Product_Code": "211136",
      "Alteration_Product_Net_Amount": 0.0,
      "Altered_Product_Code": "211135",
      "Alteration_Product_Amount": 0.0,
      "Altered_Product_Name": "Burger",
      "Alteration_Product_Name": "Add Sauce",
      "Alteration_Product_Qty": 1.0
    }
  ],
  "StoreId": "1234",
  "dob": "20191024",
  "Order_Mode_Id": "3",
  "checknumber": "54321"
}
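As a stand-alone sanity check outside NiFi, the flowfile content can be parsed with plain Python to confirm the JSON is well formed and that the Alterations array is properly opened and closed (the snippet uses a trimmed copy of the payload; the full payload behaves the same):

```python
import json

# Trimmed copy of the flowfile content above; illustrative only.
flowfile = '''{
  "Revenue_Label": "Dining Room",
  "checkClosetime": "2019-10-24T13:43:19+13:00",
  "Alterations": [
    {"Alteration_Product_Code": "211136", "Alteration_Product_Qty": 1.0}
  ],
  "checknumber": "54321"
}'''

record = json.loads(flowfile)                   # raises json.JSONDecodeError if malformed
assert isinstance(record["Alterations"], list)  # the REPEATED field must arrive as a JSON array
print("flowfile JSON parses cleanly")
```

If this check fails on the real flowfile content, the problem is in the payload rather than in the schema property.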
However, I keep getting the error below:
PutBigQueryBatch:The RECORD field must have at least one sub-field:java.lang.IllegalArgumentException:The RECORD field must have at least one sub-field
Within the properties of the PutBigQueryBatch processor I have used the schema definition below, which matches the BigQuery table definition in order, mode, name, and type:
[
  {"mode": "NULLABLE", "name": "Revenue_Label", "type": "STRING"},
  {"mode": "NULLABLE", "name": "StoreName", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Revenue_Id", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Alteration_Flag", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Order_Mode_Label", "type": "STRING"},
  {"mode": "NULLABLE", "name": "checkClosetime", "type": "TIMESTAMP"},
  {
    "mode": "REPEATED",
    "name": "Alterations",
    "type": "RECORD",
    "fields": [
      {"mode": "NULLABLE", "name": "Alteration_Product_Code", "type": "STRING"},
      {"mode": "NULLABLE", "name": "Alteration_Product_Net_Amount", "type": "FLOAT"},
      {"mode": "NULLABLE", "name": "Altered_Product_Code", "type": "STRING"},
      {"mode": "NULLABLE", "name": "Alteration_Product_Amount", "type": "FLOAT"},
      {"mode": "NULLABLE", "name": "Altered_Product_Name", "type": "STRING"},
      {"mode": "NULLABLE", "name": "Alteration_Product_Name", "type": "STRING"},
      {"mode": "NULLABLE", "name": "Alteration_Product_Qty", "type": "FLOAT"}
    ]
  },
  {"mode": "NULLABLE", "name": "StoreId", "type": "STRING"},
  {"mode": "NULLABLE", "name": "dob", "type": "STRING"},
  {"mode": "NULLABLE", "name": "Order_Mode_Id", "type": "STRING"},
  {"mode": "NULLABLE", "name": "checknumber", "type": "STRING"}
]
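Since the error complains specifically about a RECORD field with no sub-fields, a small stand-alone sketch of that check may help isolate whether the schema string that actually reaches the processor still carries the nested "fields" array. This is plain Python mirroring the condition in the error message, not the processor's actual code, and it uses a trimmed copy of the schema:

```python
import json

# Trimmed copy of the schema above; illustrative only. A RECORD field must
# carry a non-empty "fields" list, which is what the error message complains about.
schema_text = '''[
  {"mode": "NULLABLE", "name": "Revenue_Label", "type": "STRING"},
  {"mode": "REPEATED", "name": "Alterations", "type": "RECORD",
   "fields": [
     {"mode": "NULLABLE", "name": "Alteration_Product_Code", "type": "STRING"},
     {"mode": "NULLABLE", "name": "Alteration_Product_Qty", "type": "FLOAT"}
   ]},
  {"mode": "NULLABLE", "name": "checknumber", "type": "STRING"}
]'''

def check_records(fields):
    """Recursively check that every RECORD field has at least one sub-field."""
    for field in fields:
        if field.get("type") == "RECORD":
            subs = field.get("fields")
            if not subs:
                raise ValueError(
                    "RECORD field %r has no sub-fields" % field.get("name"))
            check_records(subs)

check_records(json.loads(schema_text))
print("every RECORD field has sub-fields")
```

If this check passes on the exact text pasted into the processor property but the processor still fails, the "fields" key is likely being lost or mangled somewhere between the property value and the BigQuery client.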
What I have tried:
- Removed all the whitespace and carriage-return characters from the JSON flowfile.
- Moved the array to the start and to the end of the JSON file, matching the BigQuery table definition to each layout.
- Changed the field names to remove underscores.
- Built the insert one column at a time; it only fails once the array is present.
- Used the schema definition generated directly by Google Cloud BigQuery via the utility "bq show --schema --format=prettyjson".
- Set the mode of each of the array items to REQUIRED instead of NULLABLE.
- Removed all the whitespace and carriage-return characters from the schema definition in the properties of the PutBigQueryBatch processor in NiFi.
All of these resulted in the same error as stated above.
Note: elsewhere in my NiFi flow I successfully use the PutBigQueryBatch processor to update a table. The difference is that that JSON does not contain an array.
Does anyone have any ideas on how to fix the issue?
Comment: Alterations array not closed. – daggett