8
votes

I use metadata heavily with my Google Cloud Storage bucket, and I now have a use case where I need to search for files by the values of some metadata fields. I have looked through the entire documentation (https://cloud.google.com/storage/docs/gsutil/addlhelp/WorkingWithObjectMetadata) but didn't find anything about searching on metadata. Is there any way I can do this, or should I go ahead and build something myself?

2 Answers

6
votes

No, GCS doesn't offer a search feature.

2
votes

It is now possible to do this with the APIs, by listing the objects and filtering on their metadata yourself.

Set up CUSTOM metadata:

  • See the Viewing and Editing Object Metadata documentation for more information.
  • Use the command gsutil setmeta -h "[METADATA_KEY]:[METADATA_VALUE]" gs://[BUCKET_NAME]/[OBJECT_NAME].
  • [METADATA_KEY] should start with x-goog-meta-, as stated in the documentation for custom metadata, so [METADATA_KEY] = x-goog-meta-[CUSTOM_VALUE]. NOTE: no spaces in [CUSTOM_VALUE].
  • [METADATA_VALUE] should also contain no spaces. Use _ or - instead.
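As a rough sketch of the steps above, the gsutil call could be assembled from Python like this. The helper name build_setmeta_command and the bucket/object names are made up for illustration; only the gsutil setmeta -h syntax comes from the steps above, and the no-spaces check simply mirrors the advice given there.

```python
PREFIX = "x-goog-meta-"  # prefix that marks a metadata key as custom


def build_setmeta_command(bucket, object_name, custom_key, value):
    """Return the argv list for a `gsutil setmeta` call (hypothetical helper)."""
    # Add the custom-metadata prefix if the caller left it off.
    key = custom_key if custom_key.startswith(PREFIX) else PREFIX + custom_key
    # Follow the advice above: reject spaces in the key or the value.
    if " " in key or " " in value:
        raise ValueError("use _ or - instead of spaces")
    return ["gsutil", "setmeta", "-h", f"{key}:{value}",
            f"gs://{bucket}/{object_name}"]


# Placeholder bucket and object names, for illustration only.
cmd = build_setmeta_command("my-bucket", "report.csv", "test_metadata", "v1")
print(" ".join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) on a machine where gsutil is installed and authenticated.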

Search objects based on CUSTOM metadata of objects using APIs.

This example uses Python, but the Listing Objects documentation covers many other languages as well.

  • List the objects in the bucket. Sample Code
  • For every object found, list all of its metadata. Sample Code
  • Instead of printing everything, take only the blob.metadata part. If no custom metadata has been set it will be None; otherwise it will be a dictionary of your custom metadata keys and values.
  • Compare the value stored under your key with == against the value you are looking for, and act on the matching objects accordingly.
  • NOTE: Custom keys are saved without the x-goog-meta- prefix. The prefix is only used in the API call that sets the metadata, to mark the key as custom; only the part after it is visible in the metadata. E.g. if you set x-goog-meta-test_metadata, it will be visible as test_metadata.
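The steps above can be sketched roughly as follows, assuming the google-cloud-storage Python client. The function names filter_blobs_by_metadata and search_bucket are my own, not part of the library, and the filter is kept separate so it can be tried on stand-in objects without touching a real bucket.

```python
from types import SimpleNamespace


def filter_blobs_by_metadata(blobs, key, value):
    """Yield blobs whose custom metadata has `key` set to `value`."""
    for blob in blobs:
        # blob.metadata is None when the object has no custom metadata.
        metadata = blob.metadata or {}
        # Keys are stored without the x-goog-meta- prefix.
        if metadata.get(key) == value:
            yield blob


def search_bucket(bucket_name, key, value):
    """List a bucket and keep the matching blobs (needs credentials)."""
    # Imported here so the filter above works without the library installed.
    from google.cloud import storage

    client = storage.Client()
    # Assumption: the listing returns each object's custom metadata;
    # if blob.metadata looks empty, try blob.reload() before filtering.
    return list(filter_blobs_by_metadata(client.list_blobs(bucket_name), key, value))


# Stand-in objects for a dry run; real code would call search_bucket(...).
fake_blobs = [
    SimpleNamespace(name="a.txt", metadata={"test_metadata": "v1"}),
    SimpleNamespace(name="b.txt", metadata=None),
    SimpleNamespace(name="c.txt", metadata={"test_metadata": "v2"}),
]
matches = [b.name for b in filter_blobs_by_metadata(fake_blobs, "test_metadata", "v1")]
print(matches)
```

Keep in mind that this filters on the client side, so every object in the bucket is still listed; for large buckets it may be worth narrowing the listing with a name prefix first.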

I did a little coding myself, and this Python example does exactly what I described above. You can find the code on GitHub here.