2
votes

I am trying to append data to a table in BigQuery using Pandas and google-cloud-big query. The problem I am having is I can create a table initially using if_exists=append but when I rerun the exact same query, I get an error -

"pandas_gbq.gbq.InvalidSchema: Please verify that the structure and data types in the DataFrame match the schema of the destination table."

Here is the code I am running:

import pandas as pd
from datetime import date
import pandas_gbq
from google.cloud import bigquery

output = [['Test', 'v1', 'cost_center', '123', date(2020, 1, 31), 30.0],
          ['Test', 'v1', 'cost_center', '345', date(2020, 1, 31), 72.0]]

headers = ['scenario', 'version', 'entity', 'account', 'period', 'amount']
df_output = pd.DataFrame(output, columns=headers)

dataset_table = 'my_dataset'
project_id ='my_project_id'
table_schema = [{'name':'scenario', 'type':'string'},
                {'name':'version', 'type':'string'},
                {'name':'entity', 'type':'string'},
                {'name':'account', 'type':'string'},
                {'name':'period', 'type':'date'},
                {'name':'amount', 'type':'float'}
]
df_output.to_gbq(destination_table=dataset_table, 
                 project_id= project_id, 
                 if_exists='append',
                 table_schema=table_schema)

It looks as if this issue has been encountered before, here. But there is no answer. Other questions on SO dealt with issues where there was a type difference between the dataframe schema and the BQ table schema.

To simplify the problem further, I tried making all columns have 'type':'String', but I get the same error message.

Thank you for the assistance.

Here is a screenshot of the table schema in BQ UI after script runs once.

Screenshot of the table schema in BQ UI

1
Can you share your pandas version? - Nick_Kh
I am using Pandas 1.0.3 and Pandas-gbq 0.13.1. - Andrew
also using python 3.8.2 on Windows 10 machine - Andrew
I conversed with someone maintaining Pandas-gbq. He suggested I upgrade to version 0.13.2 which should solve the problem. I did the upgrade, but the problem remains. Current version is now Panda-gbq 0.13.2 - Andrew
i think you should write 'float64' as type for 'amount'. There is no Float type in BigQuery. - Priya Agarwal

1 Answers

1
votes

I can repro the same error as you and after placing some debug output in pandas library (which you can do the same), when type names are in upper case it works for me

table_schema = [{'name':'scenario', 'type':'STRING'},
                {'name':'version', 'type':'STRING'},
                {'name':'entity', 'type':'STRING'},
                {'name':'account', 'type':'STRING'},
                {'name':'period', 'type':'DATE'}, 
                {'name':'amount', 'type':'FLOAT'}