0
votes

Although Google Cloud Storage is a flat object store that doesn't need directory entries, adding pseudo-directory placeholders (empty objects with names ending in /) makes gcsfuse a lot faster. With the placeholders in place, you can leave out the gcsfuse --implicit-dirs option and still browse your GCS directories with very workable performance, which is not the case without them.

Q. Is there a way to issue a command to gsutil like gsutil cp -r your_directory gs://your-bucket/ that will create the directory placeholders while uploading files?

The alternative is to call the GCS API, but gsutil has a lot of useful features including parallel uploads and retry handling.
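For concreteness, here is a minimal sketch of what that API alternative might look like, assuming the google-cloud-storage Python package and default credentials. The function names placeholder_names and create_placeholders are hypothetical, and the sketch assumes that a zero-byte object whose name ends in / is what gcsfuse treats as a directory placeholder. It only creates the placeholders; the files themselves would still be uploaded with gsutil.

```python
import os


def placeholder_names(local_root, prefix=""):
    """Return the 'dir/' object names implied by a local directory tree.

    Names are relative to the parent of local_root, mirroring where
    `gsutil cp -r local_root gs://bucket/` puts the files.
    """
    parent = os.path.dirname(os.path.abspath(local_root))
    names = []
    for dirpath, _dirnames, _filenames in os.walk(local_root):
        rel = os.path.relpath(os.path.abspath(dirpath), parent)
        names.append(prefix + rel.replace(os.sep, "/") + "/")
    return sorted(names)


def create_placeholders(bucket_name, local_root, prefix=""):
    """Upload one empty object per directory (assumes google-cloud-storage)."""
    # Imported lazily so placeholder_names() works without the package.
    from google.cloud import storage

    bucket = storage.Client().bucket(bucket_name)
    for name in placeholder_names(local_root, prefix):
        # Assumption: a zero-byte object named 'dir/' acts as the placeholder.
        bucket.blob(name).upload_from_string(b"")
```

Calling create_placeholders("your-bucket", "your_directory") would be run before or after the gsutil cp -r command. This loses nothing from gsutil's parallel uploads and retries, but it is a second step rather than a gsutil option.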

Example

Make the local tree:

$ mkdir -p your_directory/subdir
$ echo hi > your_directory/hi.txt
$ echo there > your_directory/subdir/there.txt

$ ls -lR your_directory
total 8
-rw-r--r--  1 jerry  staff   3 Jan 21 17:24 hi.txt
drwxr-xr-x  3 jerry  staff  96 Jan 21 17:24 subdir/

your_directory/subdir:
total 8
-rw-r--r--  1 jerry  staff  6 Jan 21 17:24 there.txt

Copy it to GCS with gsutil:

$ gsutil cp -r your_directory gs://your-bucket/
Copying file://your_directory/hi.txt [Content-Type=text/plain]...
Copying file://your_directory/subdir/there.txt [Content-Type=text/plain]...
/ [2 files][    9.0 B/    9.0 B]
Operation completed over 2 objects/9.0 B.

$ gsutil ls -lr gs://your-bucket/your_directory
gs://your-bucket/your_directory/:
         3  2020-01-22T01:25:38Z  gs://your-bucket/your_directory/hi.txt

gs://your-bucket/your_directory/subdir/:
         6  2020-01-22T01:25:38Z  gs://your-bucket/your_directory/subdir/there.txt
TOTAL: 2 objects, 9 bytes (9 B)

Notice that gsutil only created 2 objects (blobs) -- the text files. It did not create directory placeholder blobs your_directory/ or your_directory/subdir/.
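To make the missing objects concrete, here is a small sketch that derives the placeholder names implied by a list of object names and reports which ones don't exist. The helper name implied_placeholders is hypothetical, and the object list is just the listing above typed in by hand.

```python
def implied_placeholders(object_names):
    """Return every 'dir/' prefix implied by the given object names, sorted."""
    placeholders = set()
    for name in object_names:
        parts = name.split("/")[:-1]  # drop the final path component
        for depth in range(1, len(parts) + 1):
            placeholders.add("/".join(parts[:depth]) + "/")
    return sorted(placeholders)


# Object names relative to gs://your-bucket/, from the listing above.
objects = ["your_directory/hi.txt", "your_directory/subdir/there.txt"]
existing = set(objects)
missing = [p for p in implied_placeholders(objects) if p not in existing]
print(missing)  # ['your_directory/', 'your_directory/subdir/']
```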

In a gcsfuse your-bucket your-bucket mount:

$ find your_directory
find: your_directory: No such file or directory

In a gcsfuse --implicit-dirs your-bucket your-bucket mount:

$ find your_directory
your_directory
your_directory/hi.txt
your_directory/subdir
your_directory/subdir/there.txt

This works, but slowly.

Back to a gcsfuse your-bucket your-bucket mount, we can make the text files show up by creating the directory placeholders:

$ mkdir your_directory
$ ls your_directory
hi.txt

$ mkdir your_directory/subdir
$ ls your_directory
hi.txt  subdir/

$ ls your_directory/subdir/
there.txt

Is your performance issue about mounting your bucket on a local directory with gcsfuse? -- guillaume blaquiere

@guillaumeblaquiere Yes, with --implicit-dirs, listing even a tiny directory of 2 files takes seconds. gcsfuse is impractical that way. -- Jerry101

1 Answer

-1
votes

If I understand correctly, you want to upload files while also creating what appear to be empty folders (which in the background are just empty objects with a "/" at the end of their names). In that case, gsutil cp -r your_directory gs://your-bucket/ does the trick.

For reference, see how subdirectories work in GCS and the documentation for the gsutil cp command.