
The google ferris2 framework seems to use the blobstore API exclusively for its Upload component. That makes me question whether it's possible to make images uploaded to Cloud Storage public without writing my own upload method and abandoning the Upload component altogether, which also seems to create compatibility issues with the Cloud Storage client library (Python).

Backstory / context

  1. Using Google App Engine, Python, and the Cloud Storage client library

Requirements: Neither blob information nor the file itself may be stored in the model. We want a public Cloud Storage serving URL on the model, and that is all. This seems to prevent us from using the normal Ferris approach for uploading to Cloud Storage.

Things I already know / roadblocks: One of the big roadblocks is that Ferris uses cgi / the blobstore API for field storage on the form. This causes problems because, so far, it hasn't allowed data to be sent to Cloud Storage through the Google Cloud Storage Python client.

Things we know about the Google Cloud Storage Python client and cgi: To write data to Cloud Storage from our server, Cloud Storage needs to be called with cloudstorage.open("/bucket/object", "w", ...) (a Cloud Storage library method). However, it appears so far that the POST returns a cgi.FieldStorage for the wtforms.fields.FileField() (as shown by a simple "print image" statement) before the data is applied to the model; after it is applied to the model, it is a blobstore instance.

I would like verification on this: after a lot of research and testing, it seems that because Ferris is limited to the blobstore API for the Upload component, using the blobstore API and blob keys to handle uploads is basically unavoidable without creating a second upload function just for the Cloud Storage call. Blob instances do not seem to be compatible with the Cloud Storage client library, and there appears to be no way to get anything but metadata from blob files (without actually making a call to Cloud Storage to fetch the original file). However, it appears that this will not require storing extra data on the server. Furthermore, I believe the public-link issue can be worked around by giving the entire bucket read permissions.
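On the world-readable-bucket idea: once a bucket (or object) grants READER to allUsers, the public URL is predictable, so the model would only need to store a plain string. A minimal sketch, with a made-up bucket and object name:

```python
def public_gcs_url(bucket, object_name):
    """Return the public HTTPS URL for an object in a world-readable
    bucket. Only resolves once the bucket/object ACL grants READER
    to allUsers."""
    return 'https://storage.googleapis.com/%s/%s' % (bucket, object_name)


# e.g. stored on the model as a plain string property
# (bucket and object names here are hypothetical):
url = public_gcs_url('my-app-uploads', 'images/cat.png')
```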

Clarifying questions:

  1. To make uploaded images available to the public via our application (any user, not just an authenticated user), will I have to use the cloudstorage Python client library, or is there a way to do this with the blobstore API?

  2. Is there a way to get the original file from a blob key (on save, with the add action method) without first having to make a call to Cloud Storage, so that the file can be uploaded using that library?

  3. If not, is there a way to grab the file from the cgi.FieldStorage and then send it to Cloud Storage with the Python client library? cgi.FieldStorage.value seems to contain only metadata rather than the file, and the same goes for cgi.FieldStorage.file.read().


2 Answers

0
votes

1) You cannot use the GAE GCS client to update an ACL.

2) You can use the GCS JSON API after the blobstore upload to GCS and change the ACL to make the object public. You do not have to upload again. See this example code which inserts an ACL.
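A hedged sketch of what that JSON API call looks like: objectAccessControls.insert with entity allUsers and role READER makes a single object public without re-uploading. Building the request is plain Python; actually sending it needs an OAuth2 access token (on App Engine, app_identity.get_access_token can supply one), which is only indicated in a comment. Bucket and object names are placeholders, and the object name would need URL-encoding in real use:

```python
import json


def make_public_acl_request(bucket, object_name):
    """Build the objectAccessControls.insert request that grants
    READER to allUsers, i.e. makes one object publicly readable."""
    url = ('https://www.googleapis.com/storage/v1/b/%s/o/%s/acl'
           % (bucket, object_name))
    body = json.dumps({'entity': 'allUsers', 'role': 'READER'})
    return url, body


url, body = make_public_acl_request('my-bucket', 'cat.png')
# POST `body` to `url` with 'Authorization: Bearer <token>', e.g. via
# google.appengine.api.urlfetch on App Engine.
```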

3) Or use cgi.FieldStorage to read the data (< 32 MB) and write it to GCS using the GAE GCS client:

import mimetypes

import cloudstorage as gcs
import webapp2

BUCKET = '/your_bucket_name'  # placeholder: replace with your GCS bucket


class UploadHandler(webapp2.RequestHandler):

    def post(self):
        # raw file bytes and original filename from the multipart form
        file_data = self.request.get("file", default_value=None)
        filename = self.request.POST["file"].filename
        content_type = mimetypes.guess_type(filename)[0]

        # gcs.open() expects an absolute object name: '/bucket/object'
        with gcs.open(BUCKET + '/' + filename, 'w',
                      content_type=content_type or 'binary/octet-stream',
                      options={'x-goog-acl': 'public-read'}) as f:
            f.write(file_data)

A third method: use a form POST upload with a GCS signed URL and a policy document to control the upload.
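Sketching the policy-document part of that third method: the policy is a base64-encoded JSON document listing the conditions a form POST upload must satisfy. It then has to be RSA-signed with the service account's key (on App Engine, app_identity.sign_blob can do the signing; that step is only indicated in a comment). Bucket name, key prefix, and expiry below are placeholders:

```python
import base64
import json


def make_policy_document(bucket, key_prefix, expiration_iso):
    """Build the (unsigned) base64 policy document for a GCS form
    POST upload. The caller must sign the returned string and put
    policy + signature into the HTML form."""
    policy = {
        'expiration': expiration_iso,           # e.g. '2025-01-01T00:00:00Z'
        'conditions': [
            {'bucket': bucket},
            ['starts-with', '$key', key_prefix],
            {'acl': 'public-read'},             # uploaded object is public
        ],
    }
    return base64.b64encode(json.dumps(policy).encode()).decode()


encoded = make_policy_document('my-bucket', 'uploads/', '2025-01-01T00:00:00Z')
# Sign `encoded` (e.g. app_identity.sign_blob on App Engine) and put the
# policy, signature, and key fields into the form that posts to
# https://storage.googleapis.com/my-bucket
```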

And you can always use a public download handler, which reads files from the blobstore or GCS.
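A minimal sketch of such a download handler, assuming the cloudstorage library and an App Engine response object (the GAE-only import sits inside the function so only that part depends on the runtime; the object path and bucket name are hypothetical):

```python
import mimetypes


def response_content_type(object_path):
    """Pick the Content-Type header for a served object; fall back to
    a generic binary type when the extension is unknown."""
    return mimetypes.guess_type(object_path)[0] or 'binary/octet-stream'


def serve_from_gcs(response, object_path):
    """Stream a GCS object to the client. `object_path` looks like
    '/my-bucket/uploads/cat.png'. App Engine-only, hence the local
    import of the cloudstorage library."""
    import cloudstorage as gcs  # available only on App Engine

    response.headers['Content-Type'] = response_content_type(object_path)
    with gcs.open(object_path, 'r') as f:
        response.out.write(f.read())
```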

0
votes

You can now specify the ACL when uploading a file from App Engine to Cloud Storage. I'm not sure how long it's been in place; just wanted to share:

    filename = '/' + bucket_name + '/Leads_' + newUNID() + '.csv'

    write_retry_params = gcs.RetryParams(backoff_factor=1.1)
    gcs_file = gcs.open(filename,
                        'w',
                        content_type='text/csv',
                        options={'x-goog-acl': 'public-read'},
                        retry_params=write_retry_params)
    # ... write the CSV rows with gcs_file.write(...), then:
    gcs_file.close()

docs: https://cloud.google.com/storage/docs/xml-api/reference-headers#standard