1
votes

For learning purposes, I am aiming to create a Twitter clone using Firestore.

To start off, I think I need two collections: users and tweets. I would like to make available one main feed of all tweets by all users which can be easily done:

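db.collection('tweets').get()
    .then(querySnapshot => {
        querySnapshot.forEach(tweet => {
            // Log the document ID and its fields. Interpolating tweet.data() into a
            // template string would only print "[object Object]".
            console.log(tweet.id, tweet.data());
        });
    });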

What if I want to be able to query a list of tweets by a specific user (when that user's profile is viewed)?

From my understanding, I have three options, but I am unsure of the pros and cons of each approach:

Option 1: Create a subcollection of a user which will hold the tweets:

db.collection('users').doc('username_123').collection('tweets').get()

Option 2: Create a root-level collection with a suitable name which will show the hierarchy of the data:

var username = 'username_123';
db.collection('tweets__' + username).get()

Option 3: Using an equality operator query:

var username = 'username_456';
db.collection('tweets').where("username", "==", username).get()

I would like to choose an approach which will be cost-effective at scale.

2
I only started with Firebase a couple of weeks ago, but as far as I know: Option 1 is great if you always scan over a friend's posts, but inefficient if you want to query over all tweets (you also have to query over the parent, i.e. the users). Option 3 is great for querying over all tweets (like a feed), but not as fast as Option 1 if you want to query ONLY a friend's tweets (when you show their profile). I can't say anything about Option 2. - Markus

2 Answers

0
votes

This is a good question and I’ve been investigating it myself. I see two options, and prefer Option 1.

OPTION 1: Fast for the user feed read, moderately expensive

Each new post is written in a top-level collection called post_user, and given a unique identifier of a postID_userID. The fields in the document are:

  • text (string)
  • userID (string)
  • originalPosterID (string)
  • original (boolean true or false)

Each post is then re-written as many times as necessary for each follower of the original poster, replacing userID with that particular follower's ID, and setting original to false.
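To make the fan-out concrete, here is a minimal sketch of the documents that would be written for one post. The helper name buildFanOutPosts and its arguments are hypothetical; only the field names and the postID_userID identifier come from the description above.

```javascript
// Hypothetical helper: builds the documents Option 1 would write to post_user.
// Returns an array of { id, data } objects; nothing is written to Firestore here.
function buildFanOutPosts(postID, posterID, text, followerIDs) {
  // The original post, owned by the poster.
  const original = {
    id: `${postID}_${posterID}`,
    data: { text, userID: posterID, originalPosterID: posterID, original: true },
  };
  // One copy per follower: userID is replaced and original is set to false.
  const copies = followerIDs.map(followerID => ({
    id: `${postID}_${followerID}`,
    data: { text, userID: followerID, originalPosterID: posterID, original: false },
  }));
  return [original, ...copies];
}
```

So a post by a user with two followers results in three documents: one original and two copies.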

When a user opens the app, only one Firestore query is needed:

  • Query: Find all posts where userID = currentUserID within the last 20 days
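As a rough sketch, that query could be built like this. The field list above has no timestamp, so I am assuming each document also has a createdAt field; buildFeedQuery is a made-up name. Combining an equality filter on userID with a range filter on createdAt will most likely need a composite index.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: builds (but does not run) the Option 1 feed query.
// `db` is a Firestore instance; `createdAt` is an assumed timestamp field.
function buildFeedQuery(db, currentUserID, now = new Date()) {
  // Only posts from the last 20 days.
  const cutoff = new Date(now.getTime() - 20 * DAY_MS);
  return db.collection('post_user')
    .where('userID', '==', currentUserID)
    .where('createdAt', '>=', cutoff)
    .orderBy('createdAt', 'desc');
}
```

Calling buildFeedQuery(db, currentUserID).get() would then return the feed in a single round trip.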

Firestore query speeds are proportional to the result set, not the entire data set, so this is a very fast query.

The app assembles this data and presents it to the user in an infinite scroller. Since there is no merging of data needed, this is done very fast.

When the user has scrolled to the end of the results, the above query is repeated for the next 20 days' worth of posts and loaded (like Facebook's infinite scroller).
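One way to compute the successive 20-day windows is sketched below. feedWindow is a hypothetical helper, and the createdAt field mentioned in the usage note is an assumption, since the field list above does not include a timestamp.

```javascript
// Hypothetical helper: returns the time window for a given page of the feed.
// Page 0 covers the last 20 days, page 1 the 20 days before that, and so on.
function feedWindow(pageIndex, now = new Date(), windowDays = 20) {
  const dayMs = 24 * 60 * 60 * 1000;
  const end = new Date(now.getTime() - pageIndex * windowDays * dayMs);
  const start = new Date(end.getTime() - windowDays * dayMs);
  return { start, end };
}
```

Each page would then filter with .where('createdAt', '>=', start).where('createdAt', '<', end), so that consecutive windows do not overlap.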

The number of writes on each new post (to add the posts to all followers) could make for a slow experience on the client side. So perhaps the client only creates the user post, but then once it’s created and the user gets the confirmation, have a Cloud Function run to post it to all the follower feeds. The client app would not be awaiting all of that, so it would be fast, and the followers would see the new post within a minute or so. CRON jobs could perhaps also be used for this but I’m not sure if that brings any cost savings.
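Here is a minimal sketch of what the server-side fan-out step might look like, using Firestore batched writes. The names chunk and writeFanOut are hypothetical, the Cloud Function trigger itself is left out, and the batch size of 500 is an assumption (it has been the documented cap on writes per batch, but check the current quotas).

```javascript
// Split an array into chunks of at most `size` items.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Hypothetical fan-out step, meant to run server-side (e.g. inside a Cloud Function).
// `db` is a Firestore instance; `docs` is an array of { id, data } objects,
// one per follower copy of the post.
async function writeFanOut(db, docs, batchSize = 500) {
  for (const group of chunk(docs, batchSize)) {
    const batch = db.batch();
    for (const { id, data } of group) {
      batch.set(db.collection('post_user').doc(id), data);
    }
    // Commit one batch at a time so no single commit exceeds batchSize writes.
    await batch.commit();
  }
}
```

Note that batching reduces round trips but not cost: every follower copy is still billed as one document write.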

OPTION 2: Slower for the user feed read, less expensive

Each new post is written only once in a top-level collection called posts, and given a unique identifier of a postID. The fields in the document are:

  • text (string)
  • userID (string)

When a user opens the app, many Firestore queries are needed, which takes longer (and the more people the user follows, the longer it takes):

  • Query 1: Find the userID of all people the user is following (followedUserA, followedUserB, followedUserC, etc...)
  • Query 2A: Find all posts by followedUserA within last 20 days
  • Query 2B: Find all posts by followedUserB within last 20 days
  • Query 2C: Find all posts by followedUserC within last 20 days
  • Etc.... (for however many people the user is following)

Then on the client side, these results are merged. Depending on how many people the user is following, this can take a while. The app then presents the posts to the user in an infinite scroller.
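A rough sketch of the per-user queries and the client-side merge could look like this. loadFeed and mergeFeeds are made-up names, and createdAt is an assumed timestamp field that the field list above does not include.

```javascript
// Merge per-user result arrays into one feed, newest first.
// Each item is expected to carry a numeric createdAtMs (milliseconds since epoch).
function mergeFeeds(resultsPerUser) {
  return resultsPerUser
    .flat()
    .sort((a, b) => b.createdAtMs - a.createdAtMs);
}

// Hypothetical loader: runs one query per followed user (Query 2A, 2B, ...)
// in parallel, then merges the results on the client.
async function loadFeed(db, followedUserIDs, cutoff) {
  const snapshots = await Promise.all(
    followedUserIDs.map(userID =>
      db.collection('posts')
        .where('userID', '==', userID)
        .where('createdAt', '>=', cutoff)
        .get()
    )
  );
  const results = snapshots.map(snapshot =>
    snapshot.docs.map(doc => ({
      id: doc.id,
      ...doc.data(),
      // Convert the Firestore Timestamp to a number so it can be sorted.
      createdAtMs: doc.data().createdAt.toMillis(),
    }))
  );
  return mergeFeeds(results);
}
```

Running the queries with Promise.all avoids waiting for them one after another, but the number of queries still grows with the number of people followed.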

When the user has scrolled to the end of the results, all of the above queries and merging of results must be repeated for the next 20 days’ worth.

0
votes

I think it's better to use a subcollection of tweets inside each user document, as you described in Option 1.