
I'm trying to push data from a data frame to Google Big Query.

I set my date field of the data frame as

df['time'] = df['time'].astype('datetime64[ns]')

and I set the BigQuery column type to *DATETIME*. When I do the export from Python to GBQ, I get this error:

InvalidSchema: Please verify that the structure and data types in the DataFrame match the schema of the destination table.

If I convert everything to string format, it works. I don't think you can set a data frame field to a plain date, right? Is there a clever way to get this working, or do dates have to be passed as strings?

TIA.
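For context, the conversion above can be reproduced locally with pandas alone (the column name matches the question, but the frame contents here are made up):

```python
import pandas as pd

# Made-up stand-in for the real data frame.
df = pd.DataFrame({"time": ["2019-03-08 12:00:00", "2019-03-08 13:30:00"]})

# Same conversion as in the question: the column becomes datetime64[ns].
df["time"] = df["time"].astype("datetime64[ns]")
print(df["time"].dtype)  # datetime64[ns]
```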

Could you print out df['time'] and see if it matches the format BQ needs? You could even create a datetime object of the same date and compare the two to see if they are considered equal. - Ying Li

2 Answers

1
votes

I found that loading data with DATE and DATETIME type columns was not working, so I tried the TIMESTAMP data type instead and could then load the data into the BigQuery table.

While defining the schema, define date columns as TIMESTAMP, as below:

bigquery.SchemaField('dateofbirth', 'TIMESTAMP')

Then convert the dataframe column from object to a datetime format that BigQuery can understand:

df.dateofbirth = df.dateofbirth.astype('datetime64[ns]')

As of 8-Mar-2019, the DATE and DATETIME column types are not working.
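A minimal local sketch of the conversion step this answer describes, using a made-up dateofbirth column (the SchemaField line needs the google-cloud-bigquery client and a real table, so it appears only as a comment):

```python
import pandas as pd

# Hypothetical data standing in for the real table.
df = pd.DataFrame({"dateofbirth": ["1990-01-15", "1985-06-02"]})
print(df["dateofbirth"].dtype)  # object

# The conversion from the answer; newer pandas wants the unit spelled out.
df["dateofbirth"] = df["dateofbirth"].astype("datetime64[ns]")
print(df["dateofbirth"].dtype)  # datetime64[ns]

# On the BigQuery side this pairs with (not executed here):
# bigquery.SchemaField('dateofbirth', 'TIMESTAMP')
```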

0
votes

Changing the DATETIME data type to TIMESTAMP in the BigQuery schema will give you a time value with UTC attached. That might not be the ideal scenario for many of us. Instead, try the code below:

# table_schema pins the column type (e.g. DATETIME) so the values are
# not coerced to TIMESTAMP; table_schema, bigquery_client, dataframe and
# table_id are assumed to be defined elsewhere.
job_config = bigquery.LoadJobConfig(
    schema=table_schema, source_format=bigquery.SourceFormat.CSV
)
load_job = bigquery_client.load_table_from_dataframe(
    dataframe, table_id, job_config=job_config
)
load_job.result()  # wait for the load job to finish
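The UTC behaviour mentioned above can be seen locally with pandas alone (the timestamp value is illustrative):

```python
import pandas as pd

naive = pd.Timestamp("2019-03-08 12:00:00")

# A TIMESTAMP column stores an absolute instant, so the value comes
# back with a UTC offset attached:
utc = naive.tz_localize("UTC")
print(utc)    # 2019-03-08 12:00:00+00:00

# A DATETIME column keeps the plain wall-clock value, with no zone:
print(naive)  # 2019-03-08 12:00:00
```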