I want to upload a pandas DataFrame to BigQuery using the DataFrame.to_gbq() function.
I pass a table_schema argument to force a specific column order in BigQuery (one that may differ from the DataFrame's column order).
For example, I use:
table_schema = [{'name': 'col1', 'type': 'INT64'},
                {'name': 'col2', 'type': 'STRING'},
                {'name': 'col3', 'type': 'STRING'},
                {'name': 'col4', 'type': 'STRING'},
                {'name': 'col5', 'type': 'STRING'},
                {'name': 'col6', 'type': 'FLOAT64'},
                {'name': 'col7', 'type': 'INT64'},
                {'name': 'col8', 'type': 'FLOAT64'}]
Dataframe.to_gbq(destination_table, if_exists='replace', table_schema=table_schema)
The column order in the DataFrame is: Col1, Col3, Col4, Col5, Col2, Col6, Col7, Col8
The job completes without errors.
But when I check the schema of the created (or replaced) destination_table in BigQuery, the column order is: Col1, Col3, Col4, Col5, Col2, Col6, Col7, Col8
(the order of the DataFrame, not that of table_schema).
Shouldn't the order specified in table_schema be respected?
If not, is there a way to force it?
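
One workaround I could try, assuming the table simply follows the DataFrame's column order (which is what I observe above), is to reorder the DataFrame columns to match table_schema before uploading. Below is a minimal sketch of that idea. The helper name schema_column_order is my own and not part of pandas or pandas-gbq, and it assumes the DataFrame column names match the schema names exactly, including case:

```python
def schema_column_order(table_schema, df_columns):
    """Return the column names in table_schema order.

    Raises ValueError if the schema and the DataFrame do not contain
    exactly the same column names (hypothetical helper, not a pandas API).
    """
    ordered = [field["name"] for field in table_schema]
    missing = set(ordered) - set(df_columns)
    extra = set(df_columns) - set(ordered)
    if missing or extra:
        raise ValueError(
            f"schema/DataFrame mismatch: missing={sorted(missing)}, extra={sorted(extra)}"
        )
    return ordered


table_schema = [{'name': 'col1', 'type': 'INT64'},
                {'name': 'col2', 'type': 'STRING'},
                {'name': 'col3', 'type': 'STRING'},
                {'name': 'col4', 'type': 'STRING'},
                {'name': 'col5', 'type': 'STRING'},
                {'name': 'col6', 'type': 'FLOAT64'},
                {'name': 'col7', 'type': 'INT64'},
                {'name': 'col8', 'type': 'FLOAT64'}]

# Column order as it currently is in the DataFrame (lower-cased here to
# match the schema names; this is an assumption for the example).
df_columns = ["col1", "col3", "col4", "col5", "col2", "col6", "col7", "col8"]

new_order = schema_column_order(table_schema, df_columns)
print(new_order)

# With a real DataFrame `df` (not created in this sketch), the upload would be:
# df = df[schema_column_order(table_schema, df.columns)]
# df.to_gbq(destination_table, if_exists='replace', table_schema=table_schema)
```

Selecting columns with df[list_of_names] returns them in the order of the list, so the upload would then see the columns in schema order. This does not answer whether to_gbq itself is supposed to honor the order in table_schema.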