4
votes

I am experiencing extremely slow performance of Google Cloud Datastore queries.

My entity structure is very simple:

calendarId, levelId, levelName, levelValue

There are only about 1,400 records, yet the query takes 500 ms to 1.2 s to return the data. Another query on a different entity kind takes 300-400 ms for just 313 records.

I am wondering what might be causing such a delay. Can anyone give some pointers on how to debug this issue or which factors to inspect?

Thanks.

Have you set the chunk size on your query? - Andrei Volgin
Can you share some code showing how you go about fetching the data? We might be able to help you optimize those queries. - TheAddonDepot
Where is the code? - Raghvendra Kumar
@AndreiVolgin That is a good point. We are using an API (built on top of the Datastore APIs) that doesn't use prepared queries. We can check whether using prepared queries with a chunk/prefetch size specified helps. My only concern is that this is a single query executed only once, so these options probably won't make much difference. Thanks for your input. - Abhishek
The default chunk size is 10; you can go up to 500. That's a big difference in the number of fetches. - Andrei Volgin

2 Answers

0
votes

You are experiencing expected behavior. You shouldn't need to fetch that many entities when presenting a page to a user. Gmail doesn't show you 1,000 emails; it shows you 25-100, depending on your settings. You should fetch a smaller number (e.g., the first 100) and implement some kind of paging so users can see the other entities.
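To illustrate the paging idea, here is a minimal sketch in Python. It uses an in-memory list as a stand-in for the query results, and the `fetch_page` helper and the integer cursor are hypothetical names made up for this example. With the real Datastore you would most likely use query cursors instead of integer offsets, but the control flow is similar.

```python
from typing import List, Optional, Tuple


def fetch_page(
    records: List[dict], cursor: Optional[int], page_size: int = 100
) -> Tuple[List[dict], Optional[int]]:
    """Return one page of records and the cursor for the next page.

    The cursor is a plain list offset here (an assumption for illustration);
    it is None when there are no more pages.
    """
    start = cursor or 0
    end = start + page_size
    page = records[start:end]
    next_cursor = end if end < len(records) else None
    return page, next_cursor


# Hypothetical data shaped like the entity in the question.
records = [
    {"calendarId": 1, "levelId": i, "levelName": f"level-{i}", "levelValue": i}
    for i in range(1400)
]

# Fetch only what one screen needs; ask for the next page on demand.
first_page, cursor = fetch_page(records, None)
second_page, cursor = fetch_page(records, cursor)
```

Each request then transfers at most `page_size` records instead of the whole result set.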

If this is backend processing, then you will simply need that much time to process entities, and you'll need to take that into account.

Note that you generally want to fetch your entities in large batches, and not one by one, but I assume you are already doing that based on the numbers in your question.
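As a rough back-of-the-envelope check of why batch size matters, the sketch below counts how many fetches a query would need for the record counts in the question, using the chunk sizes quoted in the comments (10 and 500). It assumes each chunk costs one round trip and ignores any separate prefetch setting, so treat the numbers as an estimate only; `round_trips` is a hypothetical helper, not a Datastore API.

```python
def round_trips(total_results: int, chunk_size: int) -> int:
    """Number of fetches needed, assuming one round trip per chunk."""
    # Integer ceiling division: a partial last chunk still costs one fetch.
    return (total_results + chunk_size - 1) // chunk_size


for total in (1400, 313):
    for chunk_size in (10, 500):
        print(f"{total} results, chunk size {chunk_size}: "
              f"{round_trips(total, chunk_size)} fetches")
```

Under these assumptions, 1,400 results take 140 fetches at a chunk size of 10 but only 3 at a chunk size of 500, so per-fetch latency can dominate the total query time.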

0
votes

Not sure if this will help, but you could try packing more data into a single entity by using embedded entities. Embedded entities are not true entities; they are just properties that allow for nested data. So instead of having 4 properties per entity, create an array property on the entity that stores a list of embedded entities, each with those 4 properties. The maximum size of an entity is 1 MB, so you'll want to pack the array to get as close to that limit as possible.

This will lower the number of true entities and I suspect this will also reduce overall fetch time.
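A minimal sketch of the packing step, in plain Python with no Datastore calls: it groups records into lists whose estimated size stays under a byte budget, so each list could become the array property of one entity. The `pack_records` helper, the JSON-based size estimate, and the 1,000,000-byte budget are all assumptions for illustration; Datastore computes entity size differently, so leave some headroom.

```python
import json
from typing import Dict, List

# Conservative stand-in for the roughly 1 MB entity limit (an assumption).
MAX_ENTITY_BYTES = 1_000_000


def pack_records(
    records: List[Dict], max_bytes: int = MAX_ENTITY_BYTES
) -> List[List[Dict]]:
    """Group records so each group's estimated JSON size is <= max_bytes.

    A single record larger than max_bytes still gets a group of its own.
    """
    packed: List[List[Dict]] = []
    current: List[Dict] = []
    current_size = 0
    for record in records:
        size = len(json.dumps(record).encode("utf-8"))
        # Start a new group when adding this record would exceed the budget.
        if current and current_size + size > max_bytes:
            packed.append(current)
            current, current_size = [], 0
        current.append(record)
        current_size += size
    if current:
        packed.append(current)
    return packed


# Hypothetical data shaped like the entity in the question.
records = [
    {"calendarId": 1, "levelId": i, "levelName": f"level-{i}", "levelValue": i}
    for i in range(1400)
]
groups = pack_records(records)
```

With records this small, all 1,400 fit in a single group under this estimate, which would turn 1,400 entity reads into one.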