Suppose I have a table with a single field named "version", of type STRING. When I load data into the table using autodetect, quoted values like "1.1" or "1" are inferred as FLOAT or INTEGER respectively, even though they are strings in the JSON.
data1.json example:
{ "version": "1.11.0" }
bq load output:
$ bq load --autodetect --schema_update_option=ALLOW_FIELD_ADDITION --source_format=NEWLINE_DELIMITED_JSON temp_test.temp_table ./data1.json
Upload complete.
Waiting on bqjob_ZZZ ... (1s) Current status: DONE
data2.json example:
{ "version": "1.11" }
bq load output:
$ bq load --autodetect --schema_update_option=ALLOW_FIELD_ADDITION --source_format=NEWLINE_DELIMITED_JSON temp_test.temp_table ./data2.json
Upload complete.
Waiting on bqjob_ZZZ ... (0s) Current status: DONE
BigQuery error in load operation: Error processing job 'YYY:bqjob_ZZZ': Invalid schema update. Field version has changed type from STRING to FLOAT
data3.json example:
{ "version": "1" }
bq load output:
$ bq load --autodetect --schema_update_option=ALLOW_FIELD_ADDITION --source_format=NEWLINE_DELIMITED_JSON temp_test.temp_table ./data3.json
Upload complete.
Waiting on bqjob_ZZZ ... (0s) Current status: DONE
BigQuery error in load operation: Error processing job 'YYY:bqjob_ZZZ': Invalid schema update. Field version has changed type from STRING to INTEGER
The problem doesn't happen when the same file also contains another JSON row whose value is correctly inferred as a string (as described in the question "Bigquery autoconverting fields in data"):
{ "version": "1.12" }
{ "version": "1.12.0" }
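To make the pattern concrete, here is a rough Python sketch that reproduces the behavior observed in the three loads above. It is only a model of what I am seeing, not BigQuery's actual inference logic: the function names `guess_type` and `column_type` are hypothetical, the regular expressions are my own approximation, and the rule that a mix of integer-like and float-like values yields FLOAT is an assumption I have not tested.

```python
import json
import re

# Hypothetical approximation of the observed inference, NOT BigQuery's real algorithm.
INT_RE = re.compile(r"[+-]?\d+")
FLOAT_RE = re.compile(r"[+-]?\d+\.\d+")


def guess_type(value):
    """Guess the type a quoted JSON string value appears to be inferred as."""
    if INT_RE.fullmatch(value):
        return "INTEGER"
    if FLOAT_RE.fullmatch(value):
        return "FLOAT"
    return "STRING"


def column_type(lines, field="version"):
    """Guess the column type for a newline-delimited JSON file.

    One value that only parses as a string makes the whole column STRING,
    which matches the two-row file above. INTEGER mixed with FLOAT giving
    FLOAT is an untested assumption.
    """
    types = {guess_type(json.loads(line)[field]) for line in lines if line.strip()}
    if types == {"INTEGER"}:
        return "INTEGER"
    if types and types <= {"INTEGER", "FLOAT"}:
        return "FLOAT"
    return "STRING"


if __name__ == "__main__":
    for sample in (
        ['{ "version": "1.11.0" }'],
        ['{ "version": "1.11" }'],
        ['{ "version": "1" }'],
        ['{ "version": "1.12" }', '{ "version": "1.12.0" }'],
    ):
        print(sample, "->", column_type(sample))
```

Under this model, a file is only safe to load with autodetect if at least one of its rows has a "version" value that cannot be read as a number.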
The question mentioned above has an answer stating that a fix was pushed to production, but the bug appears to be back. Is there a workaround to prevent this?