
I would like to use a Google Cloud Function to quickly transfer data from a CSV file in GCS to a Postgres table in Cloud SQL.

Ideally I would use the GCP SDK to perform this action, but the official documentation suggests that the only ways to perform the import are 1) the console, 2) a gcloud command, or 3) curl. I've adopted the curl approach using requests in Python. The code I've written below (I've omitted the Cloud Function request wrapper) works on my laptop but doesn't work in the Cloud Function: the function completes without reporting an error, but my data is never loaded into the table, unlike when I run the code from my laptop.

import google.auth.transport.requests
import json
import requests


credentials, project = google.auth.default()

"""
According to the docs, hitting the REST endpoint requires an 
access token to be passed in the request header. It was not
clear how to obtain an access token without resorting to
gcloud commands, but I finally stumbled across a solution
here: https://stackoverflow.com/a/55804230/554481

At this point in the code credentials.valid is False, and 
credentials.token is None, so I need to perform a refresh
to populate them.
"""
auth_req = google.auth.transport.requests.Request()
credentials.refresh(auth_req)

# Now you can use credentials.token
access_token = credentials.token

# Removed business details for SO post
instance_name = '<removed>'
bucket_name = '<removed>'
project_id = '<removed>'
gcs_path = '<removed>'
database = '<removed>'
table_name = '<removed>'

headers = {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer ' + access_token
}
endpoint = 'https://www.googleapis.com/sql/v1beta4/projects/{project_id}/instances/{instance_name}/import'.format(
    project_id=project_id,
    instance_name=instance_name
)
gcs_location = 'gs://{bucket_name}/{gcs_path}'.format(
    bucket_name=bucket_name,
    gcs_path=gcs_path
)
json_payload = {
    "importContext": {
        "fileType": "CSV",
        "uri": gcs_location,
        "database": database,
        "csvImportOptions":{
            "table":table_name
        }
    }
}
requests.post(
    url=endpoint,
    data=json.dumps(json_payload),
    headers=headers
)
What is the error? Your code does not handle errors, so how do you know? – John Hanley

Your Cloud Functions service account needs the permission cloudsql.instances.import, which is part of the Cloud SQL Admin role. Double-check the roles assigned to the service account that Cloud Functions is using (the default is the App Engine service account). – John Hanley
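As the first comment suggests, a quick way to see what is going wrong is to capture and log the response instead of discarding it. A minimal sketch, reusing the endpoint, headers, and json_payload variables from the question's code:

import json
import requests

response = requests.post(
    url=endpoint,
    data=json.dumps(json_payload),
    headers=headers
)

# Surface failures in the Cloud Function logs instead of swallowing them;
# a 403 here typically means the service account lacks cloudsql.instances.import.
if response.status_code != 200:
    print('Import request failed: {}'.format(response.text))
else:
    print('Import operation started: {}'.format(response.json()))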

1 Answer


Because you aren't checking the response, the POST request may be completing without raising an exception while the API reports a failure in the response body. If you take a look at the Instances: import documentation, there is an example of making a request with the discovery client:

from googleapiclient import discovery
from oauth2client.client import GoogleCredentials
from pprint import pprint

credentials = GoogleCredentials.get_application_default()

service = discovery.build('sqladmin', 'v1beta4', credentials=credentials)

# Project ID of the project that contains the instance.
project = 'my-project'  # TODO: Update placeholder value.

# Cloud SQL instance ID. This does not include the project ID.
instance = 'my-instance'  # TODO: Update placeholder value.

instances_import_request_body = {
    # Request body mirroring the question's payload.
    'importContext': {
        'fileType': 'CSV',
        'uri': 'gs://my-bucket/my-file.csv',  # TODO: Update placeholder value.
        'database': 'my-database',  # TODO: Update placeholder value.
        'csvImportOptions': {
            'table': 'my-table'  # TODO: Update placeholder value.
        }
    }
}

request = service.instances().import_(project=project, instance=instance, body=instances_import_request_body)
response = request.execute()

# TODO: Change code below to process the `response` dict:
pprint(response)

The contents of response will show you whether the request succeeded or why it failed.
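Note that the import runs as a long-running operation, so a successful response only means the operation was started. A rough sketch of polling it to completion, assuming the service and project variables from the snippet above:

import time

# The import request returns a long-running operation; poll until it finishes.
operation_name = response['name']
while True:
    op = service.operations().get(project=project, operation=operation_name).execute()
    if op.get('status') == 'DONE':
        break
    time.sleep(2)

# Failure details, if any, are reported on the finished operation.
if 'error' in op:
    print('Import failed:', op['error'])
else:
    print('Import completed successfully')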