0 votes

I have a cloud function triggered by a storage bucket on google.storage.object.finalize. In this function, I download the file which triggered the function and perform some transforms on the data. If this file is ~5KB, the function executes as expected. If the file is ~200MB, the file.download callback never runs and Stackdriver shows nothing but "Function execution took 14 ms, finished with status: 'ok'".

I understand that there is a 10MB limit on files uploaded via HTTP to HTTP-triggered functions, but this function is triggered by Cloud Storage. The answer given to this question states that the 10MB limit is not imposed on storage-triggered functions, so I suspect a possible timeout issue instead.

The function is set to a 2GB memory limit, 5 minute timeout, and all buckets/functions are in the same region. Surely this would be enough resources to transfer and process a 200MB file (profiling locally shows the process completing in a few seconds while remaining under 512MB memory).

const { Storage } = require('@google-cloud/storage');
const storage = new Storage();
const uploadBucket = storage.bucket("<bucket name>");

module.exports = (data, context) => {

    console.log('Getting file ref');  // Logged regardless of file size
    const file = uploadBucket.file(<file name>);

    file.download(function (err, buffer) {
        console.log('Starting file processing');
        // With a smaller file, this callback runs as expected.
        // With larger files, this code is never reached.
    });
};

Do I have an incorrect understanding of the bandwidth available between the function and the storage bucket or does this suggest another issue?

If you are posting code, it is better to post the actual code that you are using. Otherwise, we can only guess. Also, your code has no error handling. Start by writing good solid code and then test and debug. - John Hanley
Your function doesn't seem to be returning a promise that resolves only after all the async work is complete. Without that, the function will be terminated and cleaned up early before the work finishes. - Doug Stevenson
@DougStevenson - well that's a palm-to-forehead moment if ever there were! The returned promise is exactly what I was missing. I had gotten so fixated on the file.download() method that I forgot to consider the termination of the Cloud Function itself. If you'd like to leave an answer, I'll be glad to mark as accepted. - user2864874
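For reference, a minimal sketch of the fix described in the comments above: returning the promise from file.download() so the Cloud Function is not terminated before the asynchronous work completes. It assumes the bucket and object name arrive on the event payload (data.bucket, data.name), as they do for google.storage.object.finalize events; the transform itself is left as a placeholder.

const { Storage } = require('@google-cloud/storage');
const storage = new Storage();

module.exports = (data, context) => {
    const file = storage.bucket(data.bucket).file(data.name);

    // Returning the promise keeps the function instance alive until the
    // download and any chained processing have finished.
    return file.download().then(([contents]) => {
        console.log('Starting file processing');
        // ...perform the transforms on contents here...
        console.log(`Processed ${contents.length} bytes`);
    });
};

Equivalently, the handler can be declared async and await the download before doing the processing.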

1 Answer

0 votes

I was able to copy a file of 286.15 MB from a source bucket to a destination bucket. This is my cloud function:

exports.helloGCSGeneric = (data, context) => {

    // With @google-cloud/storage 1.x, the module export is a function.
    const storage = require('@google-cloud/storage')();

    const bucket = storage.bucket(data.bucket);
    const file = bucket.file(data.name); // the object that triggered the function

    const anotherBucket = storage.bucket('destination_bucket');

    // Callback style:
    file.copy(anotherBucket, function (err, copiedFile, apiResponse) {
        if (err) {
            console.error('Copy failed', err);
        }
    });

    // Or, promise style:
    const newLocation = 'gs://another-bucket/new_file';
    file.copy(newLocation).then(function (data) {
        const newFile = data[0];
        const apiResponse = data[1];
    });
};

This is my package.json file:

{
  "name": "sample-cloud-storage",
  "version": "0.0.1",
  "dependencies": {
    "@google-cloud/storage": "^1.6.0"
  }
}

I then trigger the function by uploading a file:

gsutil cp filename gs://source_bucket/

According to the documentation Quotas & limits:

There is a maximum size limit of 5 TB for individual objects stored in Cloud Storage.