5
votes

I have a user submission form that includes images. Originally I was using CarrierWave, but with that approach the images are sent to my server for processing before being saved to Google Cloud Storage, and if the images are too large, the request times out and the user just gets a server error.

So what I need is a way to upload directly to GCS. Active Storage seemed like the perfect solution, but I'm getting really confused about how hard compression seems to be.

An ideal solution would be to resize the image automatically upon upload, but there doesn't seem to be a way to do that.

A next-best solution would be to create a resized variant upon upload using something like @record.images.first.variant(resize_to_limit: [xxx, xxx]) # using image_processing gem, but the docs seem to imply that a variant can only be created upon page load, which would obviously be extremely detrimental to load time, especially if there are many images. More evidence for this is that when I create a variant, it's not in my GCS bucket, so it clearly only exists in my server's memory. If I try

@record.images.first.variant(resize_to_limit: [xxx, xxx]).service_url

I get a URL back, but it's invalid: the image fails to render on my site, and when I visit the URL directly, I get these errors from GCS:

The specified key does not exist. No such object.

so apparently I can't create a permanent url.

A third best solution would be to write a Google Cloud Function that automatically resizes the images inside Google Cloud, but reading through the docs, it appears that I would have to create a new resized file with a new url, and I'm not sure how I could replace the original url with the new one in my database.

To summarize, what I'd like to accomplish is to allow direct upload to GCS, but control the size of the files before they are downloaded by the user. My problems with Active Storage are that (1) I can't control the size of the files on my GCS bucket, leading to arbitrary storage costs, and (2) I apparently have to choose between users having to download arbitrarily large files, or having to process images while their page loads, both of which will be very expensive in server costs and load time.

It seems extremely strange that Active Storage would be set up this way and I can't help but think I'm missing something. Does anyone know of a way to solve either problem?

In my opinion, dump active_storage and use the shrine gem, which will handle highly customised derivative options for you. See it here: shrinerb.com/docs/plugins/derivatives (BenKoshy)
@BKSpurgeon Gave it a shot, but documentation for Google Cloud Storage unfortunately appears to be nonexistent. (Joe Morano)
Shrine is cloud agnostic. You'll have to rely on a third-party gem for that: github.com/renchap/shrine-google_cloud_storage - it should work out of the box with (minor) modifications to make the API conform. I am 99% sure that the author is also using derivatives. I can't imagine it taking more than 20 minutes to get this working. (BenKoshy)
@BKSpurgeon Do you know of a GCS solution for Uppy? That also only seems to have S3 documentation. (Joe Morano)
Uppy should work with Google's APIs with only minor changes to the Shrine sample code. I've succeeded in uploading files to Google Cloud Storage using Uppy and Shrine, but in my particular use case I simply switched to AWS without too much trouble. That said, it could be a stumbling point if the AWS API changes; I'm not sure whether the Uppy dev team considers supporting Google a high priority, nor do I know if the community has chipped in with pull requests to support GCS. (BenKoshy)
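
For reference, a minimal sketch of the Shrine setup these comments describe, assuming the shrine, image_processing, and shrine-google_cloud_storage gems (the bucket names, uploader class, and :resized key below are illustrative, not from the thread):

# config/initializers/shrine.rb
require "shrine"
require "shrine/storage/google_cloud_storage"

Shrine.storages = {
  cache: Shrine::Storage::GoogleCloudStorage.new(bucket: "my-cache-bucket"), # placeholder bucket names
  store: Shrine::Storage::GoogleCloudStorage.new(bucket: "my-store-bucket")
}

Shrine.plugin :activerecord
Shrine.plugin :derivatives

# app/uploaders/image_uploader.rb
require "image_processing/mini_magick"

class ImageUploader < Shrine
  Attacher.derivatives do |original|
    magick = ImageProcessing::MiniMagick.source(original)
    { resized: magick.resize_to_limit!(800, 800) } # size limit is a placeholder
  end
end

With the derivatives plugin, the resized files are created explicitly (for example by calling record.image_attacher.create_derivatives from a background job, assuming an attachment named image) rather than lazily at page-load time.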

2 Answers

2
votes

Here's what I did to fix this:

1- I upload the attachment that the user added directly to my service provider (I use S3).

2- I add an after_commit callback that enqueues a Sidekiq worker to generate the thumbs (a sketch of this wiring follows the list).

3- My Sidekiq worker (AttachmentWorker) calls my model's generate_thumbs method.

4- generate_thumbs will loop through the different sizes that I want to generate for this file.
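
Roughly, the wiring for steps 2 and 3 looks like this (the model and callback names here are illustrative; only AttachmentWorker and generate_thumbs come from my code):

# app/models/attachment.rb
class Attachment < ApplicationRecord
  has_one_attached :file

  # Enqueue thumb generation once the record and its attachment are committed
  after_commit :enqueue_thumb_generation, on: [:create, :update]

  private

  def enqueue_thumb_generation
    AttachmentWorker.perform_async(id)
  end
end

# app/workers/attachment_worker.rb
class AttachmentWorker
  include Sidekiq::Worker

  def perform(attachment_id)
    Attachment.find(attachment_id).generate_thumbs
  end
end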

Now, here's the tricky part:

def generate_thumbs
  # Each hash is a set of ImageMagick transformations passed to variant via file_url
  [
    { resize: '300x300^', extent: '300x300', gravity: :center },
    { resize: '600>' }
  ].each do |size|
    self.file_url(size, true)
  end
end

def file_url(size, process = false)
  value = self.file # where file is my has_one_attached
  
  if size.nil?
    url = value
  else
    url = value.variant(size)

    if process
      url = url.processed
    end
  end

  return url.service_url
end

In the file_url method, we only call .processed if we pass process = true. I've experimented a lot with this method to get the best possible performance out of it.

The .processed call checks with your bucket whether the variant already exists, and if it doesn't, it generates the new file and uploads it.
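
For example (a hypothetical call site, where attachment is an instance of the model carrying this method), a view or serializer can reuse a size that the background job already processed without triggering any new processing:

# process stays false, so this only builds the URL for the already-uploaded variant
thumb_url = attachment.file_url({ resize: '300x300^', extent: '300x300', gravity: :center })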

Also, here's another question that I have previously asked concerning ActiveStorage that can also help you: ActiveStorage & S3: Make files public

0
votes

I absolutely don't know Active Storage. However, a good pattern for your use case is to resize the image when it comes in. For this:

  • Let the user store the image in Bucket1
  • When the file is created in Bucket1, an event is triggered. Plug a Cloud Function onto this event
  • The Cloud Function resizes the image and stores it into Bucket2 (a sketch of such a function follows this list)
  • You can delete the image in Bucket1 at the end of the Cloud Function, or keep it for a few days, or move it to cheaper storage (to keep the original image in case of issue). For these last two actions, you can use lifecycle rules to delete or change the storage class of files.
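
A rough sketch of such a function, assuming the Ruby runtime with the functions_framework, google-cloud-storage, and mini_magick gems (the function name, target bucket, and resize geometry are placeholders):

require "functions_framework"
require "google/cloud/storage"
require "mini_magick"

# Triggered by a storage object-finalize event on Bucket1
FunctionsFramework.cloud_event "resize_image" do |event|
  data    = event.data # includes the "bucket" and "name" of the new object
  storage = Google::Cloud::Storage.new

  source_bucket = storage.bucket data["bucket"]
  target_bucket = storage.bucket "bucket2" # placeholder name for the resized-images bucket

  tmp_path = File.join "/tmp", File.basename(data["name"])
  source_bucket.file(data["name"]).download tmp_path

  image = MiniMagick::Image.open tmp_path
  image.resize "1200x1200>" # placeholder geometry: shrink only if larger
  image.write tmp_path

  target_bucket.create_file tmp_path, data["name"]
end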

Note: You can use the same bucket (instead of Bucket1 and Bucket2), but an event to resize the image will then be sent every time a file is created in the bucket. You can use Pub/Sub as middleware and add a filter on it so that your function is only triggered when the file is created in the correct folder. I wrote an article on this.