I tried to clone a Firestore database. I found a guide on this topic (https://qrc.ninja/2019/03/20/cloning-firestore-data/), so I followed the steps described there.

To export the database I did the following:

gcloud config set project [PROJECT_ID]
gcloud firestore export gs://[BUCKET_NAME]

To import the database I did the following:

gcloud config set project [DESTINATION_PROJECT_ID]
gsutil acl ch -u [RIGHTS_RECIPIENT]:R gs://[BUCKET_NAME]
gcloud firestore import gs://[BUCKET_NAME]/[TIMESTAMPED_DIRECTORY]

The last step (gcloud firestore import ...) resulted in this error:

ERROR: (gcloud.firestore.import) Entity too large

I searched for this problem, but the only hint I could find was in a cached Google result of this page: https://cloud.google.com/datastore/docs/export-import-entities. There it says:

An import operation updates entity keys and key reference properties in the import data with the project ID of the destination project. If this update increases your entity sizes, it can cause "entity is too big" or "index entries too large" errors for import operations. To avoid either error, import into a destination project with a shorter project ID.

My project ID looks like this: XX-XXXXX-XXXXXXX. It is 16 characters long. Since my project requires a paid plan, simply testing with a shorter project ID wouldn't be free.

So I would be grateful for any hints on whether the ID is really the problem, or whether there is something else I could try to clone my database.

Update: I can clone the database by exporting/importing single collections. But one of my collections has over 79,000 documents. When I export this large collection and try to import it, I still get

ERROR: (gcloud.firestore.import) Entity too large
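For reference, exporting and importing a single collection is done with the --collection-ids flag on the export and import commands; the sketch below uses the same placeholders as above, with [COLLECTION_ID] standing in for the collection being copied:

gcloud config set project [PROJECT_ID]
gcloud firestore export gs://[BUCKET_NAME] --collection-ids=[COLLECTION_ID]

gcloud config set project [DESTINATION_PROJECT_ID]
gcloud firestore import gs://[BUCKET_NAME]/[TIMESTAMPED_DIRECTORY] --collection-ids=[COLLECTION_ID]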

1 Answer

This kind of issue is usually related to entities that have somehow grown over the allowed size, so that errors arise when trying to restore a database (from an export followed by an import). The problem lies in the import, since the export does not enforce any size restrictions. The project ID shouldn't have anything to do with it.

One way to check this is to load your data into BigQuery and inspect the largest entities yourself. Cloud Datastore entities must respect the documented limits, in particular the maximum entity size. The size of an entity is the sum of:

  • The key size
  • The sum of the property sizes
  • 32 bytes

You can check the size of each entity manually, by writing a script, or by loading the data into BigQuery. How the size of an entity is calculated is described in the same documentation.
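As a sketch of the BigQuery route: one collection from the export can be loaded with bq and then sorted by an approximate row size. The exact path of the .export_metadata file depends on your export, [DATASET] and [TABLE] are placeholders, and BYTE_LENGTH(TO_JSON_STRING(...)) is only a rough approximation of the entity-size formula above, but it is usually enough to spot outliers:

bq load --source_format=DATASTORE_BACKUP [DATASET].[TABLE] gs://[BUCKET_NAME]/[TIMESTAMPED_DIRECTORY]/all_namespaces/kind_[COLLECTION_ID]/all_namespaces_kind_[COLLECTION_ID].export_metadata

bq query --use_legacy_sql=false 'SELECT t.__key__.name AS doc_id, BYTE_LENGTH(TO_JSON_STRING(t)) AS approx_size FROM [DATASET].[TABLE] AS t ORDER BY approx_size DESC LIMIT 20'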

Additionally, you can run the command:

gcloud datastore operations describe [OPERATION_ID]

with the import operation ID to get more details.
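If the operation ID is not at hand, the operations of the currently configured project can be listed first, for example with:

gcloud firestore operations list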

I also found a related report in the Public Issue Tracker. As far as is mentioned there, this issue should be resolved by modifying the affected entities.