1
votes

I was exploring options for loading Firestore Native mode data (collections and documents) into BigQuery, but it's not working for me.

Question: Does BigQuery support importing an export produced by Firestore in Native mode?

Setup: one collection with multiple documents (no subcollections).

Steps:

  • Export to a Cloud Storage bucket: https://firebase.google.com/docs/firestore/manage-data/export-import
  • Import into BQ: https://cloud.google.com/bigquery/docs/loading-data-cloud-firestore

Error while loading into BQ: 'Does not contain valid backup metadata'

Analysis: The BigQuery link says that the URI should contain KIND_COLLECTION_ID and that the file name should end with [KIND_COLLECTION_ID].export_metadata. Neither is true for the files in my Firestore Native mode export. These rules seem to apply only to a Firestore in Datastore mode export.

  • Verify [KIND_COLLECTION_ID] is specified in your Cloud Storage URI. If you specify the URI without [KIND_COLLECTION_ID], you receive the following error: does not contain valid backup metadata. (error code: invalid)
  • The URI for your Cloud Firestore export file should end with [KIND_COLLECTION_ID].export_metadata. For example: default_namespace_kind_Book.export_metadata. In this example, Book is the collection ID, and default_namespace_kind_Book is the file name generated by Cloud Firestore
2
I tried to recreate your issue but it worked just fine for me. Can you double-check which file you used from your GCS bucket? For my test, the full path was [Bucket]/[Date-time]/all_namespaces/kind_[collection]/all_namespaces_kind_[collection].export_metadata ... - Kolban
Thanks for responding! Are you sure that you are using Native mode and not Datastore mode? On export to a bucket from Firestore, GCP creates a folder like this every time: 2019-10-24T12:13:17_27544/. For me, the folder contains a file, 2019-10-30T13:12:58_36484.overall_export_metadata, and two folders: export0/, which contains export0.export_metadata and output-0, and export1/, which contains export1.export_metadata and output-0. - Ayush
I am pretty sure I was using Firestore Native mode. Can you update the question with the exact command you used to export the data? - Kolban
Same as mentioned in the link above: gcloud beta firestore export gs://firestore_export - Ayush
What is the exact path and full name of the GCS object you used for the import into BQ? - Kolban

2 Answers

1
votes

When you export Firestore collections to GCS, a directory structure is created that looks like this:

[Bucket]
  - [Date/Time]
    - [Date/Time].overall_export_metadata
    - all_namespaces
      - kind_[collection]
        - all_namespaces_kind_[collection].export_metadata

When you import the export into BigQuery, use this file:

[Bucket]/[Date/Time]/all_namespaces/kind_[collection]/all_namespaces_kind_[collection].export_metadata

Specifically, if you use [Bucket]/[Date/Time]/[Date/Time].overall_export_metadata, you will get the error you described. See also the note here under Console > Bullet 3, which reads:

Note: Do not use the file ending in overall_export_metadata. This file is not usable by BigQuery.
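
To make the path rule above concrete, here is a minimal sketch that builds the per-collection metadata URI from the layout shown and rejects the overall metadata file. The function names, the bucket name `my-bucket`, and the collection name `Book` are made up for illustration; only the directory layout comes from this answer.

```python
def collection_metadata_uri(bucket: str, export_prefix: str, collection: str) -> str:
    """Build the gs:// URI of the per-collection metadata file.

    Assumes the layout described above:
    [Bucket]/[Date/Time]/all_namespaces/kind_[collection]/...
    """
    return (
        f"gs://{bucket}/{export_prefix}/all_namespaces/"
        f"kind_{collection}/all_namespaces_kind_{collection}.export_metadata"
    )


def is_loadable_uri(uri: str) -> bool:
    """Return True only for a per-collection .export_metadata file."""
    name = uri.rsplit("/", 1)[-1]
    # The overall metadata file is the one BigQuery cannot use.
    if name.endswith(".overall_export_metadata"):
        return False
    # Per-collection files carry the "kind_<collection>" part in their name.
    return name.endswith(".export_metadata") and "kind_" in name


# Hypothetical bucket and collection; the folder name is the one from the comments.
uri = collection_metadata_uri("my-bucket", "2019-10-24T12:13:17_27544", "Book")
print(uri, is_loadable_uri(uri))
```

A URI that passes this check is the kind of path to hand to `bq load --source_format=DATASTORE_BACKUP`, as described in the BigQuery guide linked in the question. A name such as export0.export_metadata fails the check because it has no `kind_` part.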

0
votes

If you want to create a pipeline from Firestore to BigQuery, you should manually format the Firestore collection as a BigQuery table. I have used Cloud Scheduler, Cloud Functions, and Firestore batched operations to migrate the data from Firestore to BigQuery. I created example code here.
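
As a rough illustration of what "manually formatting" a document could mean, the sketch below flattens one Firestore document, already read into a Python dict, into a single-level row. This is not the example code linked above. The function name `to_bq_row`, the `doc_id` column, and the underscore naming for nested fields are all assumptions chosen for this sketch.

```python
import json
from datetime import datetime, timezone


def to_bq_row(doc_id: str, data: dict) -> dict:
    """Flatten one Firestore document (as a dict) into a single-level row."""
    row = {"doc_id": doc_id}  # hypothetical key column for the document ID

    def walk(prefix: str, value) -> None:
        if isinstance(value, dict):
            # Nested maps become columns named parent_child.
            for key, child in value.items():
                walk(f"{prefix}_{key}" if prefix else key, child)
        elif isinstance(value, datetime):
            # Timestamps are stored as ISO 8601 strings.
            row[prefix] = value.isoformat()
        elif isinstance(value, list):
            # Arrays are kept as JSON text to avoid guessing a schema.
            row[prefix] = json.dumps(value, default=str)
        else:
            row[prefix] = value

    walk("", data)
    return row


example = to_bq_row(
    "b1",
    {
        "title": "Dune",
        "author": {"first": "Frank", "last": "Herbert"},
        "tags": ["sf", "classic"],
        "added": datetime(2019, 10, 24, tzinfo=timezone.utc),
    },
)
print(example)
```

Rows shaped like this could then be written to a table, for instance with the `insert_rows_json` method of the `google-cloud-bigquery` client, provided the table schema matches the flattened column names.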