0
votes

I am going to develop an application that creates files on Google Cloud Storage, which are then read by other processes.

File creation may be delayed for various reasons (for example, the file is large), so incomplete files (with a write still in progress) may exist on Cloud Storage.

I need to prevent other processes from reading incomplete files. According to this page, bucket listing is strongly consistent: a newly created file can be listed immediately after it is created.

From the document above, my guess is that a newly created file will not be listed until its creation has completed, so incomplete files will never be listed.

Is my guess correct? If not, what should I do to prevent reading incomplete files?


2 Answers

1
votes

Your guess is correct: a write to a bucket is atomic (when you upload a file, the content is cached before being pushed to your bucket). You can see this in the documentation:

  • Read-after-write (i.e. atomic operation, no transient state)

Thus, you don't need to worry about incomplete files.
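
As a rough sketch of what this means for the reading side, assuming the google-cloud-storage Python client: a reader can treat whatever the bucket listing returns as complete objects. The function name `read_listed_objects` and the `prefix` parameter are made up for illustration, and the client is passed in as an argument so that the function can also be tried with a stand-in object.

```python
from typing import Dict


def read_listed_objects(client, bucket_name: str, prefix: str = "") -> Dict[str, bytes]:
    """Download every object that currently appears in the bucket listing.

    Assumption: `client` behaves like google.cloud.storage.Client, i.e. it
    offers list_blobs(bucket_name, prefix=...) and the returned blobs offer
    .name and .download_as_bytes().
    Because an upload is atomic, an object whose upload is still in progress
    does not appear in the listing, so it is never read here.
    """
    contents: Dict[str, bytes] = {}
    for blob in client.list_blobs(bucket_name, prefix=prefix):
        contents[blob.name] = blob.download_as_bytes()
    return contents
```

With the real library, this would be called as `read_listed_objects(storage.Client(), "bucket_name")` after `from google.cloud import storage`.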

0
votes

I downloaded a large file (512 MB) from the internet and then uploaded it to GCS.

While the upload was in progress, I tested by listing the bucket objects with the command:

gsutil ls gs://bucket_name

The new object was not listed until the upload had completed successfully.

Therefore your guess is true.