2
votes

I want to combine multiple GCS files into one big file. According to the docs there is a compose function, which looks like it does exactly what I need: https://developers.google.com/storage/docs/json_api/v1/objects/compose

However, I can't find how to call that function from GAE using the Java client: https://developers.google.com/appengine/docs/java/googlecloudstorageclient/

Is there a way to do this with that library?

Or should I mess around with reading the files one by one using channels?

Or should I call the low level JSON methods?

What's the best way?


2 Answers

2
votes

The compose operation is available in the newer Java client. I have tried it with google-cloud-storage:1.63.0.

  /** Example of composing two blobs. */
  // [TARGET compose(ComposeRequest)]
  // [VARIABLE "my_unique_bucket"]
  // [VARIABLE "my_blob_name"]
  // [VARIABLE "source_blob_1"]
  // [VARIABLE "source_blob_2"]
  public Blob composeBlobs(
      String bucketName, String blobName, String sourceBlob1, String sourceBlob2) {
    // [START composeBlobs]
    BlobId blobId = BlobId.of(bucketName, blobName);
    BlobInfo blobInfo = BlobInfo.newBuilder(blobId).setContentType("text/plain").build();
    ComposeRequest request =
        ComposeRequest.newBuilder()
            .setTarget(blobInfo)
            .addSource(sourceBlob1)
            .addSource(sourceBlob2)
            .build();
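    // `storage` is a com.google.cloud.storage.Storage instance,
    // e.g. StorageOptions.getDefaultInstance().getService()
    Blob blob = storage.compose(request);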
    // [END composeBlobs]
    return blob;
  }
1
votes

The compose operation does indeed do exactly what you want it to do. However, the compose operation isn't currently available for the GAE Google Cloud Storage client. You have a few alternatives.

You can use the non-GAE Google APIs client (there is a Java version). It invokes the lower-level JSON API and supports compose(). The downside is that this client doesn't have any special App Engine magic, so some little things will be different. For example, if you run it in the local development server, it will contact the real Google Cloud Storage. You'll also need to configure it to authorize its requests.

Another option would be to invoke the JSON or XML APIs directly.
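If you go the direct JSON route, here is a rough sketch of what the compose request looks like, using only the JDK. It just builds the URL and the JSON body; it doesn't send anything. The class and method names (ComposeJsonSketch, composeUrl, composeBody) are made up for illustration, and the JSON escaping is deliberately minimal (quotes and backslashes only), so treat it as a starting point rather than a finished client. To actually call the API you'd POST the body to the URL with Content-Type: application/json and an Authorization: Bearer header carrying an OAuth 2.0 token, which is not shown here.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: builds the pieces of a JSON API objects.compose request.
final class ComposeJsonSketch {

    // Percent-encode a single URL path segment. URLEncoder targets form encoding,
    // so spaces come out as '+' and have to be rewritten to %20.
    static String encodePathSegment(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // POST target for composing into `destination` inside `bucket`.
    static String composeUrl(String bucket, String destination) {
        return "https://storage.googleapis.com/storage/v1/b/"
                + encodePathSegment(bucket)
                + "/o/" + encodePathSegment(destination)
                + "/compose";
    }

    // Minimal JSON string quoting: handles backslashes and double quotes only.
    static String jsonString(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Request body: the source object names (all in the same bucket as the
    // destination) plus the content type to set on the composed object.
    static String composeBody(List<String> sourceNames, String contentType) {
        String sources = sourceNames.stream()
                .map(name -> "{\"name\":" + jsonString(name) + "}")
                .collect(Collectors.joining(","));
        return "{\"sourceObjects\":[" + sources + "],"
                + "\"destination\":{\"contentType\":" + jsonString(contentType) + "}}";
    }

    public static void main(String[] args) {
        // Placeholder bucket and object names.
        System.out.println(composeUrl("my_unique_bucket", "my_blob_name"));
        System.out.println(composeBody(List.of("source_blob_1", "source_blob_2"), "text/plain"));
    }
}
```

Keep in mind that a single compose request accepts at most 32 source objects, so for more inputs you'd compose in several passes (a composed object can itself be used as a source).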

Finally, if you only need to do this one time, you could simply use the command-line utility:

gsutil compose gs://bucket/source1 gs://bucket/source2 gs://bucket/output