
I am running the following Cloud Function. It runs successfully and indicates that data was loaded to the table, but when I query BigQuery no data has been added. I get no errors and no indication that it isn't working.

from google.cloud import bigquery
import pandas as pd


def download_data(event, context):

     df = pd.read_csv('https://covid.ourworldindata.org/data/ecdc/full_data.csv')

     # Create an empty list 
     Row_list =[] 

     # Iterate over each row 
     for index, rows in df.iterrows(): 
          # Create list for the current row 
          my_list =[rows.date, rows.location, rows.new_cases, rows.new_deaths, rows.total_cases, rows.total_deaths] 
          #print(my_list)     
     # append the list to the final list 
     Row_list.append(my_list) 


     ## Get Biq Query Set up
     client = bigquery.Client()
     table_id = "<project_name>.raw.daily_load"
     table = client.get_table(table_id)

     print(client)
     print(table_id)
     print(table)


     errors = client.insert_rows(table, Row_list)  # Make an API request.
     if errors == []:
          print("New rows have been added.")

Attempted so far:

  1. Check data was being pulled -> PASSED, I printed out Row_list and the data is there
  2. Run locally from my machine -> PASSED, data appeared when I ran it from a python terminal
  3. Print out the table details -> PASSED, see attached screenshot it all appears in the logs
  4. Confirm it is able to find the table -> PASSED, I changed the name of the table to one that didn't exist and it failed

Not sure what is next, any advice would be greatly appreciated

[Screenshot: log file indicating success]

Comments:

  "Is the service account used to run this Python script the same one used to query the data?" – Claudio
  "I am querying from the GUI, so with my own credentials; the Cloud Function runs as the App Engine default service account." – Cameron Wasilewsky
  "How do you query BigQuery? Do you perform a 'preview' on the table, or do you run a real SQL query?" – guillaume blaquiere

1 Answer


This assumes that the App Engine default service account has the default Editor role assigned, and that the BigQuery table has a very simple schema. For example:

Field name      Type          Mode
date            STRING        NULLABLE
location        STRING        NULLABLE
new_cases       INTEGER       NULLABLE
new_deaths      INTEGER       NULLABLE
total_cases     INTEGER       NULLABLE
total_deaths    INTEGER       NULLABLE
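
For reference, the same schema expressed as a JSON schema file (a sketch; the field names come from the question, while the `schema.json` filename is an arbitrary choice):

```json
[
  {"name": "date",         "type": "STRING",  "mode": "NULLABLE"},
  {"name": "location",     "type": "STRING",  "mode": "NULLABLE"},
  {"name": "new_cases",    "type": "INTEGER", "mode": "NULLABLE"},
  {"name": "new_deaths",   "type": "INTEGER", "mode": "NULLABLE"},
  {"name": "total_cases",  "type": "INTEGER", "mode": "NULLABLE"},
  {"name": "total_deaths", "type": "INTEGER", "mode": "NULLABLE"}
]
```

A file like this can be passed to the `bq` CLI when creating the table, e.g. `bq mk --table [PROJECT-ID]:raw.daily_load schema.json`.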

The following modification of your code should work for an HTTP-triggered function. Notice that Row_list.append(my_list) was not inside the for loop, so only the last row ever made it into your list, and that, according to the samples in the documentation, you should be passing a list of tuples:

from google.cloud import bigquery
import pandas as pd

client = bigquery.Client()
table_id = "[PROJECT-ID].[DATASET].[TABLE]"

def download_data(request):

     df = pd.read_csv('https://covid.ourworldindata.org/data/ecdc/full_data.csv')

     # Create an empty list 
     Row_list =[] 

     # Iterate over each row 
     for index, rows in df.iterrows(): 
          # Create list for the current row 
          my_list =(rows.date, rows.location, rows.new_cases, rows.new_deaths, rows.total_cases, rows.total_deaths)  
          # append the list to the final list 
          Row_list.append(my_list) 


     ## Get BigQuery set up
     table = client.get_table(table_id)

     errors = client.insert_rows(table, Row_list)  # Make an API request.
     if errors == []:
          print("New rows have been added.")
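
The misplaced append is easy to reproduce without BigQuery at all; a minimal sketch with made-up rows showing the difference:

```python
rows = [
    ("2020-03-01", "Chile", 3),
    ("2020-03-02", "Chile", 1),
    ("2020-03-03", "Chile", 4),
]

# append OUTSIDE the loop: my_list only holds the last row by the
# time the single append runs
outside = []
for row in rows:
    my_list = row
outside.append(my_list)

# append INSIDE the loop: every row is collected
inside = []
for row in rows:
    inside.append(row)

print(len(outside))  # 1 -- only the last row survived
print(len(inside))   # 3 -- all rows
```

This is why the original function reported success (the insert itself did not error) while the table ended up with almost none of the data.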

With the very simple requirements.txt file:

# Function dependencies, for example:
# package>=version
pandas
google-cloud-bigquery
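
For completeness, a deploy command sketch for the HTTP-triggered version (the function name matches the code above; the runtime and region are assumptions you should adjust for your project):

```shell
gcloud functions deploy download_data \
  --runtime python39 \
  --trigger-http \
  --region us-central1
```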