1
votes

I was exploring options for loading Firestore Native mode data (collections and documents) into BigQuery, but it's not working for me.

Question: Does BigQuery support importing a Firestore Native mode export?

Setup: 1 collection with multiple documents (no subcollections).

Steps:

  • Export to a Cloud Storage bucket: https://firebase.google.com/docs/firestore/manage-data/export-import
  • Import into BQ: https://cloud.google.com/bigquery/docs/loading-data-cloud-firestore

Error while loading into BQ: 'Does not contain valid backup metadata'

Analysis: The link says that the URI should contain KIND_COLLECTION_ID and that the file name should end with [KIND_COLLECTION_ID].export_metadata. But neither is true for the Firestore Native mode export files. It's applicable to Firestore Datastore mode exports.

  • Verify [KIND_COLLECTION_ID] is specified in your Cloud Storage URI. If you specify the URI without [KIND_COLLECTION_ID], you receive the following error: does not contain valid backup metadata. (error code: invalid)
  • The URI for your Cloud Firestore export file should end with [KIND_COLLECTION_ID].export_metadata. For example: default_namespace_kind_Book.export_metadata. In this example, Book is the collection ID, and default_namespace_kind_Book is the file name generated by Cloud Firestore
2
I tried to recreate your issue but it worked just fine for me. Can you 100% check which file you used from your GCS bucket? For my test, the full path was [Bucket]/[Date-time]/all_namespaces/kind_[collection]/all_namespaces_kind_[collection].export_metadata ... - Kolban
Thanks for responding! Are you sure you are using Native mode and not Datastore mode? On export to a bucket from Firestore, GCP creates a folder like this every time: 2019-10-24T12:13:17_27544/. For me the folder has the following: File - 2019-10-30T13:12:58_36484.overall_export_metadata; Folder - export0/, which contains 2 files: export0.export_metadata and output-0; Folder - export1/, which contains 2 files: export1.export_metadata and output-0 - Ayush
I am pretty sure I was using Firestore native mode. Can you update the question with the exact command you used to export the data? - Kolban
Same as mentioned in the link above: gcloud beta firestore export gs://firestore_export - Ayush
What is the exact path and full name of the GCS object you used for import into BQ? - Kolban

2 Answers

1
votes

When one creates an export of Firestore collections to GCS, a directory structure is created that looks like:

[Bucket]
  - [Date/Time]
    - [Date/Time].overall_export_metadata
    - all_namespaces
      - kind_[collection]
        - all_namespaces_kind_[collection].export_metadata

When one imports an export into BigQuery, use the file:

[Bucket]/[Date/Time]/all_namespaces/kind_[collection]/all_namespaces_kind_[collection].export_metadata
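
To make the path above concrete, here is a minimal sketch that assembles the URI from its parts. The function name and its three arguments are my own illustration, not part of any Google library. It also assumes the export was made with a collection filter (for example `--collection-ids`), which, as far as I can tell, is what produces the `kind_[collection]` folder in the first place.

```python
def export_metadata_uri(bucket, export_prefix, collection):
    """Build the gs:// URI of the per-collection export_metadata file.

    bucket, export_prefix and collection are illustrative inputs; take the
    real values from your own export folder in Cloud Storage.
    """
    # Accept the bucket with or without the gs:// scheme and stray slashes.
    if bucket.startswith("gs://"):
        bucket = bucket[len("gs://"):]
    bucket = bucket.strip("/")
    export_prefix = export_prefix.strip("/")
    return (
        f"gs://{bucket}/{export_prefix}/all_namespaces/"
        f"kind_{collection}/all_namespaces_kind_{collection}.export_metadata"
    )


if __name__ == "__main__":
    # Example values only: bucket and folder name taken from the comments,
    # collection ID "Book" from the documentation excerpt in the question.
    print(export_metadata_uri("firestore_export", "2019-10-24T12:13:17_27544", "Book"))
```

If the folder named in the resulting URI does not exist in your bucket (for instance you only see `export0/`, `export1/`), the export probably was not made per collection, and this path will not apply.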

Specifically, if you use [Bucket]/[Date/Time]/[Date/Time].overall_export_metadata, you will get the error you described. See also the note here under Console > Bullet 3, which reads:

Note: Do not use the file ending in overall_export_metadata. This file is not usable by BigQuery.
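
For anyone scripting the load instead of using the console, the following is a rough sketch of the same rule in Python. `check_import_uri` and `load_firestore_export` are hypothetical helper names of mine. The BigQuery calls assume the `google-cloud-bigquery` package is installed and that credentials are already configured. The check only looks at the file name, so it cannot tell whether the export itself is loadable.

```python
def check_import_uri(uri):
    """Return uri unchanged if its name looks loadable, else raise ValueError."""
    if uri.endswith(".overall_export_metadata"):
        # This is the file the documentation note warns about.
        raise ValueError("overall_export_metadata is not usable by BigQuery")
    if not uri.endswith(".export_metadata"):
        raise ValueError("expected a URI ending in .export_metadata")
    return uri


def load_firestore_export(uri, table_id):
    """Load one collection's export into table_id ("project.dataset.table")."""
    # Imported here so the name check above works without the package installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(
        # Firestore exports are loaded with the Datastore backup source format.
        source_format=bigquery.SourceFormat.DATASTORE_BACKUP,
    )
    job = client.load_table_from_uri(
        check_import_uri(uri), table_id, job_config=job_config
    )
    job.result()  # Block until the load job finishes; raises if it failed.
```

Calling `load_firestore_export` with the `overall_export_metadata` file then fails early with a readable message instead of the "does not contain valid backup metadata" error from BigQuery.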

0
votes

If you want to create a pipeline from Firestore to BigQuery, you should manually format the Firestore collection into a BigQuery table. I have used Cloud Scheduler, Cloud Functions, and Firestore batched operations to migrate the data from Firestore to BigQuery. I created an example code here
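
As a rough illustration of what "manually format" could involve (this is not the example code linked above, just a sketch under my own assumptions), the helper below turns one Firestore document, already read into a plain Python dict, into a flat row. The name `to_bq_row`, the `document_id` column, and the choice to store nested values as JSON strings are all hypothetical.

```python
import json
from datetime import datetime, timezone


def to_bq_row(doc_id, data):
    """Flatten one document (a plain dict) into a JSON-serializable row.

    Hypothetical mapping: timestamps become ISO 8601 strings, nested maps
    and arrays become JSON strings, everything else is passed through.
    """
    row = {"document_id": doc_id}
    for key, value in data.items():
        if isinstance(value, datetime):
            row[key] = value.isoformat()
        elif isinstance(value, (dict, list)):
            row[key] = json.dumps(value, default=str, sort_keys=True)
        else:
            row[key] = value
    return row


if __name__ == "__main__":
    # Made-up sample document, for illustration only.
    doc = {
        "title": "Dune",
        "tags": ["sf"],
        "published": datetime(1965, 8, 1, tzinfo=timezone.utc),
    }
    print(to_bq_row("a1", doc))
```

Rows in this shape could then be sent to BigQuery in batches from a scheduled function. The table schema has to match whatever mapping you pick.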