I'm trying to decide on the best way to structure data for the following scenario. I have 10000 sheep on a farm with 50 interconnected fields. I track each sheep when they enter and leave a field. I want to be able to:

  • retrospectively analyse each sheep's movement over a given time period
  • analyse each field's use over time
  • instantly know which field any given sheep is currently in
  • instantly know which sheep are in a given field

I've read the documentation about de-normalising the data and I appreciate that I can do anything using queries etc.

My question is: should I duplicate the entry/exit data under both sheep and field nodes, like this?

{
    sightings: {
        uniqueSheepId: {
            FirebaseAutoId: {
                type: enter
                fieldId: uniqueFieldId
                timestamp: xxxxxxxx.xxxxx
            }
        }
    }
}

AND

{
    sightings: {
        uniqueFieldId: {
            FirebaseAutoId: {
                type: enter
                sheepId: uniqueSheepId
                timestamp: xxxxxxxx.xxxxx
            }
        }
    }
}

This seems like a good way of getting a realtime snapshot of how many sheep are in any given field. It also lets us easily see where a sheep is right now, and where it's been, without doing any querying. Obviously the dataset will be twice the size it would be if I relied on queries, but does the simplicity of getting the data outweigh the cost of storing it?

I've seen (and understand) the chatroom/members examples on SO but I think my desire to retrospectively analyse membership/usage from both the sheep and field perspectives makes my question slightly different to those answers. Any advice would be great.

1 Answer

Your current data structure would suffice as it is:

sightings: {
    uniqueSheepId: {
        FirebaseAutoId: {
            type: enter
            fieldId: uniqueFieldId
            timestamp: xxxxxxxx.xxxxx
        }
    }
}

and you would not need to duplicate it. You can write a query for each requirement you have.
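As a rough sketch of what "a query for each requirement" means with this single tree, the two "instant" lookups can be derived from the sighting history. The functions below work on plain objects shaped like the data at sightings/ and are hypothetical helpers, not part of the Firebase SDK. The "exit" type value is an assumption, since the question only shows "enter".

```javascript
// sheepSightings: the object stored at sightings/<sheepId>, keyed by push ID.
// Against a live database, the latest event could instead be fetched with
// orderByChild("timestamp").limitToLast(1) on that path.
function currentFieldOf(sheepSightings) {
  const events = Object.values(sheepSightings)
    .sort((a, b) => a.timestamp - b.timestamp);
  const last = events[events.length - 1];
  // Assumption: a sheep whose latest event is an "exit" is in no field.
  return last && last.type === "enter" ? last.fieldId : null;
}

// allSightings: the object stored at sightings/, keyed by sheep ID.
// Note that this scans every sheep, which is the cost of not denormalising.
function sheepInField(allSightings, fieldId) {
  return Object.keys(allSightings)
    .filter((sheepId) => currentFieldOf(allSightings[sheepId]) === fieldId);
}
```

The second function has to walk all 10000 sheep to answer a question about one field, which is the main reason to consider the flatter structure.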

However, I have read many times that it is better to create flatter trees in Firebase than fewer, deeper ones. The following structure would be better for some queries and not as good for others, but I believe it would be better overall and would allow other properties of fields and sheep to be added in the future.

So, to limit redundant data, I would set up three different trees: sheep, fields, and sightings.

sheep: {
    uniqueSheepId: {
        currentField: uniqueFieldId,
        sightings: {
            FirebaseAutoId: uniqueSightingsId,
            ...
        }
    }
}

fields: {
    uniqueFieldId: {
        numberOfSheep: number,
        sightings: {
            FirebaseAutoId: uniqueSightingsId,
            ...
        }
    }
}

sightings: {
    FirebaseAutoId: {
        type: enter,
        sheepId: uniqueSheepId,
        fieldId: uniqueFieldId,
        timestamp: xxxxxxxx.xxxxx
    }
}

retrospectively analyze each sheep's movement over a given time period

You can query the sheep's movement: sheep/uniqueSheepId/sightings

I also want to analyze each field's use over time

You can query the field's use: fields/uniqueFieldId/sightings
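Because the sightings lists under sheep and fields hold only sighting IDs, analysing a time period means resolving those IDs against the sightings tree and filtering by timestamp. A minimal sketch of that step on already-loaded data follows. The function name and its parameters are hypothetical, and it assumes timestamps are numeric.

```javascript
// sightingIndex: the object at sheep/<id>/sightings or fields/<id>/sightings,
//                whose values are sighting IDs.
// allSightings:  the object at sightings/, keyed by sighting ID.
// from, to:      inclusive numeric timestamp bounds.
function sightingsInPeriod(sightingIndex, allSightings, from, to) {
  return Object.values(sightingIndex)
    .map((sightingId) => allSightings[sightingId])
    // Skip dangling IDs and anything outside the requested period.
    .filter((s) => s && s.timestamp >= from && s.timestamp <= to)
    .sort((a, b) => a.timestamp - b.timestamp);
}
```

The same function serves both the per-sheep and the per-field analysis, since both index lists have the same shape.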

instantly know which field any given sheep is currently in

You can query the field for any given sheep: sheep/uniqueSheepId/currentField

instantly know which sheep are in a given field

In JavaScript you could query like so:

var ref = firebase.database().ref("sheep").orderByChild("currentField").equalTo(uniqueFieldId);

An equivalent query can be written in any Firebase-supported language.

All of this assumes you are updating and pushing to the database correctly in real time.
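One way to keep the three trees consistent is to write each sighting as a single multi-path update, so the record, both index entries, and currentField change together. The helper below is a hypothetical sketch that only builds the update object. It assumes an "exit" type exists alongside "enter", and it reuses the sighting ID as the key of each index entry rather than generating a separate auto ID.

```javascript
// Builds the paths to write for one sighting.
// sightingId would typically come from
// firebase.database().ref("sightings").push().key.
function buildSightingUpdate(sightingId, sheepId, fieldId, type, timestamp) {
  const updates = {};
  // The full record is stored once, in the sightings tree.
  updates[`sightings/${sightingId}`] = { type, sheepId, fieldId, timestamp };
  // Index entries under the sheep and the field point back to the record.
  updates[`sheep/${sheepId}/sightings/${sightingId}`] = sightingId;
  updates[`fields/${fieldId}/sightings/${sightingId}`] = sightingId;
  // Assumption: on "exit" the sheep is in no field; null removes the key.
  updates[`sheep/${sheepId}/currentField`] = type === "enter" ? fieldId : null;
  return updates;
}
```

The result could then be passed to firebase.database().ref().update(updates). The numberOfSheep counter is left out here because it depends on the previous value and would need a transaction on fields/uniqueFieldId/numberOfSheep. The sketch also assumes events arrive in order, so a late "exit" does not clear a newer currentField.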