Firebase offers private backups on Google Cloud Storage. One of the featured use cases is "Ingestion into Analytics Products":
Private Backups provides a perfect pipeline into cloud analytics products such as Google’s BigQuery. Cloud Analytics products often prefer to ingest data through Cloud Storage buckets rather than directly from the application.
I have a lot of data in Firebase (more than 1 GB when exported to a Cloud Storage bucket) and, as the Firebase offering describes, I wanted to load that data into BigQuery.
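For context, the bucket-to-BigQuery leg of that pipeline looks simple enough. Here is a minimal sketch using the google-cloud-bigquery Python client, assuming the file were newline-delimited JSON (project, dataset, and bucket names are placeholders):

from google.cloud import bigquery

# Minimal sketch: load a file from Cloud Storage into a BigQuery table.
# All names are placeholders; assumes newline-delimited JSON input.
client = bigquery.Client()
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    autodetect=True,  # or supply an explicit schema instead
)
job = client.load_table_from_uri(
    "gs://my-backup-bucket/firebase-export.json",
    "my-project.my_dataset.dinosaurs",
    job_config=job_config,
)
job.result()  # block until the load job completes

The plumbing is not the problem; the shape of the exported JSON is.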
But is it really possible to write a table schema that fits raw Firebase data? Take the dinosaur-facts database from the Firebase documentation as an example. The JSON looks like this:
{
  "dinosaurs" : {
    "bruhathkayosaurus" : {
      "appeared" : -70000000,
      "height" : 25
    },
    "lambeosaurus" : {
      "appeared" : -76000000,
      "height" : 2.1
    }
  },
  "scores" : {
    "bruhathkayosaurus" : 55,
    "lambeosaurus" : 21
  }
}
To list all dinosaurs, I suppose the only way is to use a RECORD field in the BigQuery schema. But a RECORD in BigQuery usually corresponds to an array in the imported JSON, and there is no array here in the Firebase data, just an object whose keys are the dinosaur names.
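To make that concrete, REPEATED mode would presumably need the export to contain an actual array, with each key lifted into an explicit field. A small sketch of that reshaping (the helper name is mine):

import json

# Hypothetical helper: lift each key of a Firebase-style object into an
# explicit "name" field, yielding the JSON array that a REPEATED RECORD
# could map onto.
def keyed_object_to_array(obj):
    return [{"name": key, **value} for key, value in obj.items()]

dinosaurs = {
    "bruhathkayosaurus": {"appeared": -70000000, "height": 25},
    "lambeosaurus": {"appeared": -76000000, "height": 2.1},
}
print(json.dumps(keyed_object_to_array(dinosaurs)))
# [{"name": "bruhathkayosaurus", "appeared": -70000000, "height": 25},
#  {"name": "lambeosaurus", "appeared": -76000000, "height": 2.1}]

The raw backup, of course, contains no such arrays.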
So a BigQuery table schema like this doesn't work:
[
  {
    "name": "dinosaurs",
    "type": "RECORD",
    "mode": "REQUIRED",
    "fields": [
      {
        "name": "dinosaur",
        "type": "RECORD",
        "mode": "REPEATED",
        "fields": [
          { "name": "appeared", "type": "INTEGER" },
          { "name": "height",   "type": "FLOAT"   },
          { "name": "length",   "type": "INTEGER" },
          { "name": "order",    "type": "STRING"  },
          { "name": "vanished", "type": "INTEGER" },
          { "name": "weight",   "type": "INTEGER" }
        ]
      }
    ]
  },
  {
    "name": "scores",
    "type": "RECORD",
    "mode": "REPEATED",
    "fields": [
      { "name": "dinosaur", "type": "INTEGER" }
    ]
  }
]
Is it possible to write a table schema that fits the raw Firebase data? Or do we need to prepare the data first to make it compatible with BigQuery?
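If preparing the data is the answer, here is the kind of transformation I have in mind, as a sketch (file names are placeholders and the flattening choices are mine): rewrite the backup as newline-delimited JSON, one row per dinosaur, joining in the score from the sibling scores object.

import json

# Sketch of a preprocessing step: flatten the Firebase export into
# newline-delimited JSON, one row per dinosaur. File names are placeholders.
with open("firebase-export.json") as src, open("dinosaurs.ndjson", "w") as dst:
    export = json.load(src)
    scores = export.get("scores", {})
    for name, facts in export["dinosaurs"].items():
        row = {"name": name, "score": scores.get(name), **facts}
        dst.write(json.dumps(row) + "\n")

The table schema would then shrink to flat columns (name, score, appeared, height, and so on), at the cost of an extra step between the backup and BigQuery.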