7
votes

I am struggling to find good material on best practices for filtering data in Firebase Firestore. I want to filter my data based on the categories selected by the user. I have a collection of documents in my Firestore database, and each document has an array listing all the categories that apply to it. For filtering, I also keep a local array of the user's preferred categories. All I want to do is filter the documents based on those preferred categories.

firestore categories field

Consider that the user's preferred categories are stored as an array of strings (["Film", "Music"]). I was planning to use Firestore's 'array-contains' operator like this:

db.collection(collectioname)
.where('categoriesArray', 'array-contains', ["Film", "Music"])

Later I found out that I can't use 'array-contains' with an array as the comparison value, and after investigating this issue, I decided to change my data structure as mentioned here.

categories changed to Map

Once I changed the categories from an array to a map, I thought I could use multiple where conditions to filter the documents:

let query = db.collection(collectionName)
      .where(somefield, '==', true)

this.props.data.filterCategories.forEach((val) => {
  query = query.where(`categories.${val}`, '==', true);
});

query = query
        .orderBy(someOtherField, "desc")
        .limit(itemsPerPage)

const snapshot = await query.get()

Now problem number 2: Firestore requires indexes for compound queries. The categories saved within each document are dynamic, and there's no way I can add these indexes in advance. What would be the ideal solution in such cases? Any help would be deeply appreciated.

4
What exactly do you mean when you say that you cannot make indexes in advance? - andresmijares
In my use case, the categories field in each document is different, and I can't simply define a master set of these categories beforehand. When creating a new document, the user chooses the categories that suit the current context from a list. This list comes from another collection, say categories, and the documents in that collection can change over time: new categories might get added and existing ones might get deleted. In that case I won't be able to keep up with the indexing. - nithinpp
Are you saying that Firestore rejects the query? Can you be more specific about this? Try writing your query without any loops (your current forEach loop looks like it wouldn't work - it's not actually building a query object properly). - Doug Stevenson
Is this an OR query or an AND query? Do you want to fetch documents where the category is music or film, or documents where the categories include both music and film? And, yes, the composite index limitation is a real hurdle, but it should not get in the way if you properly denormalize your data. - liquid
@bsod I'm looking to fetch all the documents where the categories include either music or film or both, along with a few other filtering conditions and a setup to paginate my data. Could you please explain in a little more detail how I can overcome such a limitation? - nithinpp

4 Answers

5
votes

This is a new feature of the Firebase JavaScript SDK, launched on November 7, 2019:

"array-contains-any operator to combine up to 10 array-contains clauses on the same field with a logical OR. An array-contains-any query returns documents where the given field is an array that contains one or more of the comparison values"

citiesRef.where('regions', 'array-contains-any',
    ['west_coast', 'east_coast']);
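
Applied to the question's data, a minimal sketch might look like the following. It assumes the array field is named categoriesArray as in the question and that `db` is an initialized Firestore instance using the same namespaced API; `chunk` and `buildCategoryQueries` are hypothetical helper names. Because the quoted limit is 10 values per clause (check the current documentation, as the limit may differ between SDK versions), the sketch splits longer lists into several queries.

```javascript
// Split a list into groups of at most `size` items (hypothetical helper).
function chunk(values, size) {
  const groups = [];
  for (let i = 0; i < values.length; i += size) {
    groups.push(values.slice(i, i + size));
  }
  return groups;
}

// Build one array-contains-any query per group of up to 10 categories.
// `db` is assumed to be an initialized Firestore instance (namespaced API).
function buildCategoryQueries(db, collectionName, categories) {
  return chunk(categories, 10).map((group) =>
    db.collection(collectionName)
      .where('categoriesArray', 'array-contains-any', group)
  );
}

// Usage (not executed here):
// const [query] = buildCategoryQueries(db, collectionName, ['Film', 'Music']);
// const snapshot = await query.get();
```

With 10 or fewer preferred categories this yields a single query, so an orderBy and limit can be chained onto it as in the question. If more than one query is produced, the results have to be merged on the client.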

1
votes

Instead of iterating through each category that you wish to query and appending clauses to a single query object, each iteration should be its own independent query. And you can keep the categories in an array.

<document>
    - itemId: abc123
    - categories: [film, music, television]

If you wish to perform an OR query, you would run n queries, one per category, each fetching the documents whose array contains that category. Then on your end, you would dedup (remove duplicates from) the results based on the item's identifier. So if you wanted to query film or music, you would run 2 queries: the first for documents where array-contains film and the second for documents where array-contains music. The results would be placed into the same collection, and then you would simply remove all duplicates with the same itemId.

This also does not pose a problem with the composite-index limit because categories is a static field. The real problem comes with pagination because you would need to keep a record of all fetched itemId in case a future page of results returns an item that was already fetched and this would create an O(N^2) scenario (more on big-o notation: https://rob-bell.net/2009/06/a-beginners-guide-to-big-o-notation/). And because you're deduping locally, pagination blocks as the user sees them are not guaranteed to be even. If each pagination block is set to 25 documents, for example, some pages may end up displaying 24, some 21, others 14, depending on how many duplicates were removed from each block.
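
A rough sketch of the approach described above, under some assumptions: `db` is an initialized Firestore instance (namespaced API), each document stores its identifier in an itemId field as in the example, and `dedupeByItemId` and `fetchAnyCategory` are hypothetical names. Tracking seen identifiers in a Map keeps each duplicate check cheap.

```javascript
// Merge several result lists, keeping the first occurrence of each itemId.
function dedupeByItemId(resultLists) {
  const seen = new Map();
  for (const list of resultLists) {
    for (const item of list) {
      if (!seen.has(item.itemId)) {
        seen.set(item.itemId, item);
      }
    }
  }
  return Array.from(seen.values());
}

// Run one array-contains query per category, then merge the results.
// `db` is assumed to be an initialized Firestore instance.
async function fetchAnyCategory(db, collectionName, categories) {
  const snapshots = await Promise.all(
    categories.map((category) =>
      db.collection(collectionName)
        .where('categories', 'array-contains', category)
        .get()
    )
  );
  const resultLists = snapshots.map((snap) => snap.docs.map((doc) => doc.data()));
  return dedupeByItemId(resultLists);
}
```

A document tagged with both film and music is read once per query, so it still counts as two reads even though it appears only once in the merged output.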

0
votes

Are you planning on retrieving documents with the exact category array? Say your user preference is listed as ["Film", "Music"]. Do you wish to retrieve only those documents with Film AND Music, or do you wish to retrieve documents having Film OR Music?

If it's the latter, then maybe you can query for all documents with "Film", then query for all documents with "Music", and merge the results. However, the drawback here is some redundant document reads when a document has both "Film" and "Music" in the categoriesArray field.

You can also explore using Algolia to enable full-text search. In this case, you'd probably store the category list as a string maybe separated by commas, then update the whole string when the user changes their preferences.

For the former case, I have not come across a workable solution other than maybe storing the categories as a concatenated string in alphabetical order. Others might have a more solid solution than mine.
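
To illustrate the concatenated-string idea, here is a minimal sketch. The helper name `categoryKey`, the comma separator, and the field name categoryKey are all assumptions; the separator only works if category names never contain a comma.

```javascript
// Build a canonical key from a list of categories: sorted, comma-joined.
// The same set of categories always yields the same string, whatever the input order.
function categoryKey(categories) {
  return [...categories].sort().join(',');
}

// Usage (not executed here), assuming each document stores this key
// in a hypothetical `categoryKey` field when it is written:
// db.collection(collectionName)
//   .where('categoryKey', '==', categoryKey(['Music', 'Film']));
```

This only matches documents whose category set is exactly the same as the user's. A document tagged Film, Music and Television would not match a query for Film and Music.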

Hope this helps!

0
votes

Your query includes an orderBy clause. This, in combination with any equality filter, requires that you create an index to support that query. There is no way to avoid this.

If you remove the orderBy, you will be able to have flexible, dynamic filters for equality using the map properties in the document. This is the only way you will be able to have a dynamic filter without creating an index. This of course means that you will have to order and page the query results on the client.
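
As a sketch of what that could look like, assuming `db` is an initialized Firestore instance and the map field is named categories as in the question (`buildMapFilterQuery` and `sortAndPage` are hypothetical names), the query is built from equality filters only and the ordering and paging happen on the fetched documents:

```javascript
// Chain one equality filter per category; no orderBy, so no composite index
// is assumed to be needed. Note that chained filters form an AND query.
function buildMapFilterQuery(db, collectionName, categories) {
  let query = db.collection(collectionName);
  for (const name of categories) {
    query = query.where(`categories.${name}`, '==', true);
  }
  return query;
}

// Sort already-fetched documents by `field` (descending) and return one page.
// `page` is zero-based.
function sortAndPage(docs, field, itemsPerPage, page) {
  const sorted = [...docs].sort((a, b) => {
    if (a[field] < b[field]) return 1;
    if (a[field] > b[field]) return -1;
    return 0;
  });
  const start = page * itemsPerPage;
  return sorted.slice(start, start + itemsPerPage);
}

// Usage (not executed here):
// const snapshot = await buildMapFilterQuery(db, collectionName, ['Film']).get();
// const docs = snapshot.docs.map((doc) => doc.data());
// const firstPage = sortAndPage(docs, someOtherField, itemsPerPage, 0);
```

The trade-off is that every matching document is read before the first page can be shown, which may be costly for large result sets.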