I'm working with Google BigQuery from a Python application.
I have a pandas DataFrame with a column, "keywords", whose values are lists. I also have a BigQuery table whose keywords field has type STRING and mode=REPEATED.
This is the schema of my BigQuery table:
SCHEMA = [
    bq.SchemaField("id", "STRING", mode="NULLABLE"),
    bq.SchemaField("fecha", "DATE", mode="NULLABLE"),
    bq.SchemaField("keywords", "STRING", mode="REPEATED"),
]
And this is my code:
import pandas as pd
from datetime import date
from google.cloud import bigquery as bq

# Sample data: "keywords" holds a Python list in every row.
df_dict = {
    "id": ["asdf173", "qwer783", "vcda619"],
    "fecha": [date(2019, 1, 15), date(2019, 1, 28), date(2019, 2, 12)],
    "keywords": [["a", "b"], ["c", "d", "e"], ["f"]],
}
df = pd.DataFrame(df_dict)

# dataset_name and table_name are strings defined elsewhere in my application.
client = bq.Client()
dataset = client.dataset(dataset_name)
table_ref = dataset.table(table_name)

# This call raises the 400 error shown below.
client.load_table_from_dataframe(df, table_ref).result()
I'm getting the following error when I try to upload the dataframe into the BigQuery table:
400 Provided Schema does not match Table project-id:dataset-name.table-name. Field keywords has changed type from STRING to RECORD.
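One workaround I am considering, which I have not verified against my table, is to skip the dataframe serialization and load the rows as JSON, so that each "keywords" value is sent as a plain array of strings. Below is a minimal sketch of the row conversion only. The helper name to_json_rows is my own, and the commented-out load step assumes Client.load_table_from_json together with a LoadJobConfig built from the SCHEMA above.

```python
from datetime import date


def to_json_rows(columns):
    """Turn a dict of equal-length columns into a list of JSON-ready row dicts."""
    names = list(columns)
    rows = []
    for values in zip(*(columns[name] for name in names)):
        row = {}
        for name, value in zip(names, values):
            # Dates are converted to ISO strings (YYYY-MM-DD); lists are kept
            # as lists so they can map onto a REPEATED field.
            row[name] = value.isoformat() if isinstance(value, date) else value
        rows.append(row)
    return rows


df_dict = {
    "id": ["asdf173", "qwer783", "vcda619"],
    "fecha": [date(2019, 1, 15), date(2019, 1, 28), date(2019, 2, 12)],
    "keywords": [["a", "b"], ["c", "d", "e"], ["f"]],
}
rows = to_json_rows(df_dict)

# Assumed next step (not run here; it needs credentials and the real table):
# job_config = bq.LoadJobConfig(schema=SCHEMA)
# client.load_table_from_json(rows, table_ref, job_config=job_config).result()
```

This would only sidestep the problem, though. I would still prefer to keep load_table_from_dataframe if there is a way to make it respect the REPEATED mode.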
How can I solve it?