1 vote

I've been dealing with very slow upload speeds from my application to S3. I have a Rails application running in a single-Docker environment on Elastic Beanstalk and a specific bucket that stores files created by users. Both are in the same region and availability zone. The files being uploaded are very small (< 1 KB) text files that take 40 seconds on average to upload. This seems ludicrous to me considering the transfer never even leaves the datacenter. Reading the files is near instant, as is moving and deleting them. Furthermore, 40 seconds seems to be a base transfer time: I've tested this by uploading a 10-byte document and a 29 KB document, and both took the same amount of time.

I'm using the Ruby aws-sdk to perform the upload, which looks like this:

file = Tempfile.new(file_name)
file.write(@content)
key = "resources/#{file_name}"
s3 = Aws::S3::Resource.new(region: ENV["AWS_REGION"])
obj = s3.bucket(bucket_name).object(key)

logger.info "** Uploading file #{file_name} to S3"
logger.info " - File size is #{file.size} bytes"
start_time = Time.now.to_i
obj.upload_file(file)
end_time = Time.now.to_i
seconds = end_time - start_time
elapse = Time.at(seconds).utc.strftime("%H:%M:%S")
logger.info "** File upload took #{elapse} to complete"

and I'm seeing output like this:

** Uploading file untitled-NUB3eAURYspbpdaBqu.md to S3
  - File size is 23 bytes
** File upload took 00:00:41 to complete

I've exhausted my research ability on this issue after reading hundreds of other posts on SO, the AWS forum, and elsewhere. Any insight into how I can improve this would be greatly appreciated.

Update: added that I was using a Tempfile object and not a file path string. It was not clear from my previous code example.

Worth enabling wire logging (wire trace) and fullest debug/verbose options for the SDK to see if that shines a light on the problem? What upload time do you see from awscli on the same client to the same bucket? - jarmod
As @jarmod mentioned, try uploading the file from the terminal using the awscli and measure the time; it may be a transient network problem - Edgar Ortega
Interesting... When enabling the wire trace, I can see that there are two attempts to upload. The first attempt fails with 400 Bad Request <Code>RequestTimeout</Code><Message>Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed.</Message>. The second attempt completes successfully. Everything else looks normal. - Matt Grannary
@MattGrannary Is the issue reproducible only for this exact bucket? And could you update the question and attach these logs if possible? - Martin Zinovsky

2 Answers

2 votes

Solution found: after trying a few options, I discovered that the issue was passing a File object to the upload_file() method. Even though the AWS documentation says this is acceptable, my issue went away when I switched to passing file.path instead.
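A plausible mechanism for why the path works where the object didn't (this is an assumption about the failure mode, not something confirmed from the SDK source): a Tempfile that has just been written has its read position at end-of-file, so anything reading from that same handle without rewinding gets nothing back, while opening a fresh handle by path starts at offset zero. A minimal pure-Ruby sketch:

```ruby
require "tempfile"

file = Tempfile.new("demo")
file.write("hello world")
file.flush

# Reading from the same handle starts at the current position,
# which is now end-of-file, so nothing comes back:
leftover = file.read          # => ""

# Opening a fresh handle via the path starts at offset zero:
full = File.read(file.path)   # => "hello world"

file.close
file.unlink
```

If the SDK announced a Content-Length based on file.size but then read zero bytes from the exhausted handle, the server would sit waiting for bytes that never arrive, which would line up with the RequestTimeout seen in the wire trace above.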

0 votes

That solution worked for me, thanks so much. To get it working I also had to flush the file contents (calling close or rewind would presumably work too). Below is my final workaround for this issue; hope it helps someone else.

file = Tempfile.new
file.write("data")
file.flush  # ensure buffered bytes reach disk before the SDK reads the path

s3_object = Aws::S3::Object.new("bucket", "key")  # placeholder bucket/key
s3_object.upload_file(file.path)                  # pass the path, not the Tempfile
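The flush matters because Ruby buffers writes in userspace: until the buffer is flushed, a small write has not reached the OS, so anything opening the file by path sees it as empty. A short sketch (pure Ruby, no AWS calls) showing the difference:

```ruby
require "tempfile"

file = Tempfile.new("flush-demo")
file.write("data")

# The 4-byte write is still sitting in Ruby's userspace buffer,
# so a fresh read through the path sees an empty file:
before = File.read(file.path)   # => ""

file.flush  # push the buffered bytes to the operating system

after = File.read(file.path)    # => "data"

file.close
file.unlink
```

This is why passing an unflushed Tempfile's path to upload_file would otherwise upload zero bytes.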