0
votes

I would like to know if it is possible to design Google Datastore indexes and queries in a way that I can reduce the number of composite indexes needed for my project.

I have been trying to use one index that works with both of my queries like the example found here. However, I cannot find a way for one index to work with my query needs, nor do I know if one even exists. Below are the two queries I have been working with and the Datastore index suggestions I receive for both. Is there something I could do differently to make these queries work with a single index?

Query 1:

CompositeFilter.and(
    CompositeFilter.and(PropertyFilter.gt("time", time1), PropertyFilter.lt("time", time2)), 
    PropertyFilter.eq("user", user1),
    PropertyFilter.eq("color", color1), 
    PropertyFilter.eq("bpm", bpm1));

addOrderBy(OrderBy.asc("time"), OrderBy.asc("location"));

Query 1 Index Suggestion:

- name: color
- name: bpm
- name: user
- name: time
- name: location

Query 2:

CompositeFilter.and(
    CompositeFilter.and(PropertyFilter.gt("time", time1), PropertyFilter.lt("time", time2)),
    PropertyFilter.eq("user", user1));

addOrderBy(OrderBy.asc("time"), OrderBy.asc("color"), OrderBy.asc("bpm"), OrderBy.asc("location"));

Query 2 Index Suggestion:

- name: user
- name: time
- name: color
- name: bpm
- name: location
1

1 Answers

1
votes

I assume "created" is actually "time" as you are using "time" in filter and "created" in ordering.

Note that any datastore query should be satisfied with a single range scan of one (or more) indexes.

In your first query, you have user, color and bpm fixed and for a given time range you want to order by the time and location. So, you need to have user, color and bpm as prefix columns followed by the time and location.

In your second query, you just have user fixed and for a given time range you want to order by time, color, bpm and location. That means, user is the only partition and then immediately the scan has to happen within a given time range. Color and BPM come after time in this case for supporting the sort order.

That's why it won't be possible to get away with a single index. However, if your data for a given user and time range you are querying is not much, you can just have an index on user and time and then do in-memory sort after getting the results. In case you want to use projection query, then you would need all the columns and your second index with in-memory filtering (for exact bpm and color) and sorting will be required.