
Here's the scenario.

A user uploads a zip file from a form. On the backend, I wrap the file part's input stream in a ZipInputStream, read it as bytes, and upload those bytes to GCS.

`public String upload(
      String bucketName,
      String objectName,
      String contentType,
      InputStream objectInputStream)
      throws IOException {
    if (contentType == null) contentType = ContentType.CONTENT_TYPE_TEXT_PLAIN_UTF8;

    BlobId blobId;
    if (largeFile) {
      blobId = BlobId.of(bucketName, objectName);
      BlobInfo blobInfo = BlobInfo.newBuilder(blobId).setContentType(contentType).build();
      WriteChannel writer = storage.writer(blobInfo);
      if (storage.get(blobId) != null) writer = storage.update(blobInfo).writer();

      byte[] buffer = new byte[1024];
      int limit;
      while ((limit = objectInputStream.read(buffer)) >= 0) {
        try {
          writer.write(ByteBuffer.wrap(buffer, 0, limit));
        } catch (Exception e) {
          logger.error("Exception uploadObject", e);
        }
      }
      writer.close();
    } else {
      byte[] objectBytes = ByteStreams.toByteArray(objectInputStream);
      blobId = storeByteArray(storage, bucketName, objectName, contentType, objectBytes);
      if (Objects.isNull(blobId)) return null;
    }
    return url(bucketName, objectName);
  }`
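
Separately from the corruption itself, the large-file loop above logs a failed write and carries on, which could leave a truncated object without the caller noticing. Below is a minimal sketch of a stricter copy loop, assuming only that the destination is a java.nio WritableByteChannel (the interface that the google-cloud-storage WriteChannel extends). ChannelCopy and copy are hypothetical names, and an in-memory channel stands in for storage.writer(blobInfo):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;

public class ChannelCopy {

  /**
   * Copies every byte of the stream into the channel and returns the byte count.
   * The channel is closed even if a read or write fails, and any IOException
   * is propagated to the caller instead of being logged and ignored.
   */
  static long copy(InputStream in, WritableByteChannel channel) throws IOException {
    long total = 0;
    try (WritableByteChannel out = channel) {
      byte[] buffer = new byte[8192];
      int n;
      while ((n = in.read(buffer)) >= 0) {
        ByteBuffer chunk = ByteBuffer.wrap(buffer, 0, n);
        // A channel may accept only part of the buffer per call, so loop until drained.
        while (chunk.hasRemaining()) {
          out.write(chunk);
        }
        total += n;
      }
    }
    return total;
  }

  public static void main(String[] args) throws IOException {
    // Stand-in data and destination; a real caller would pass the upload stream
    // and the channel returned by the storage client.
    byte[] payload = new byte[20000];
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    long copied = copy(new ByteArrayInputStream(payload), Channels.newChannel(sink));
    System.out.println("copied " + copied + " bytes, sink holds " + sink.size());
  }
}
```

Letting the exception escape means the caller sees the failure and can retry or report it, rather than returning a URL for an incomplete object.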

Code that gets the file part and calls the above method:

ZipInputStream filePartInputStream = new ZipInputStream(filePart.getInputStream());
storageGateway.uploadObject(
          "bucket_name",
          "objectname",
          filePart.getContentType(),
          filePartInputStream
       );

The upload appears to work, but when I download the zip file from the GCS bucket, it seems to be corrupted: I am not able to unzip it.

Am I missing anything here? If not, what's the correct way to upload a zip file to Google Cloud Storage?

What is content-encoding set on the uploaded object? One situation I've seen is that the client or library (or browser) auto-decodes based on this header. And (assuming you're running on a Linux machine locally) what is the output of running file downloaded_file? - Mike Schwartz
@MikeSchwartz I set the contentEncoding of the blob to "gzip" while uploading the file, but the outcome was the same. Locally I'm running this on a Mac. Edit: I ran the command file file_name.zip. The output was file_name.zip: empty - Valkyrie
If you're not going to unpack it on the fly, there's no need to use ZipInputStream, just pass the filePart.getInputStream to the storageGateway. I don't remember the ZipInputStream API, but it is possible that your current solution is actually unpacking zip on-the-fly and you're writing not-a-zip file to the storage. - xSAVIKx
Note: application/zip is the content type for zip files; gzip is a content encoding and does not apply to a zip file. - John Hanley
@Valkyrie after uploading, are you able to download the file using gsutil cp, and get a complete zip file like you originally uploaded? That would tell you the upload worked successfully and it was your download path that's having problems. - Mike Schwartz

1 Answer


Posting this as a Community Wiki answer, based on the comment from @Valkyrie describing what she did to fix it.

The solution is to read filePartInputStream into a byte array and then wrap that byte array in a ByteArrayInputStream, as below:

byte[] data = IOUtils.toByteArray(filePartInputStream);
ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(data);

This way, the file should no longer be corrupted when it is downloaded from Cloud Storage after the upload.
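
A likely reason the original upload produced an empty object (an inference from JDK behavior, not something confirmed in the thread): a ZipInputStream reports end-of-stream from read() until getNextEntry() has been called, so draining it directly yields zero bytes. The sketch below uses only the JDK. ZipStreamDemo, sampleZip and drain are hypothetical names for illustration, and an in-memory zip stands in for filePart.getInputStream():

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipStreamDemo {

  /** Builds a small zip archive in memory to stand in for the uploaded file part. */
  static byte[] sampleZip() throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (ZipOutputStream zip = new ZipOutputStream(out)) {
      zip.putNextEntry(new ZipEntry("hello.txt"));
      zip.write("hello".getBytes(StandardCharsets.UTF_8));
      zip.closeEntry();
    }
    return out.toByteArray();
  }

  /** Reads a stream to the end with the same read loop the upload method uses. */
  static byte[] drain(InputStream in) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    byte[] buffer = new byte[1024];
    int n;
    while ((n = in.read(buffer)) >= 0) {
      out.write(buffer, 0, n);
    }
    return out.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    byte[] original = sampleZip();

    // No entry has been opened, so the first read() already signals end-of-stream.
    byte[] viaZipStream = drain(new ZipInputStream(new ByteArrayInputStream(original)));

    // The raw stream yields the archive bytes unchanged.
    byte[] viaRawStream = drain(new ByteArrayInputStream(original));

    System.out.println("original:           " + original.length + " bytes");
    System.out.println("via ZipInputStream: " + viaZipStream.length + " bytes"); // 0 bytes
    System.out.println("via raw stream:     " + viaRawStream.length + " bytes");
  }
}
```

If the archive does not need to be unpacked on the server, passing the raw part stream (or a byte-array copy of it, as above) keeps the archive bytes intact, which matches the suggestion in the comments to skip ZipInputStream.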