I am migrating a non-partitioned Cloudant database to Cloudant's new partitioned databases in order to reduce the cost of my IBM Cloud account. The context can be summarized as follows:
- I am dealing with email objects, each of which has a category name
- I might receive more than 100 new entries (emails) per day
- The UI can query the emails from date A to date B, filtered on categories C1, C2, ... C100 in any possible combination.
- The UI displays only 15 emails/page
My question is how to partition such a data model so as to avoid, as much as possible, global (cross-partition) queries, which are far more costly than partition queries.
My first idea was to partition per day. However, a query may filter on a specific category Cn over 4 months while that category receives only 1 email per day. To display a single page in the UI (15 emails), I would then need 15 partition queries, one per day, which is not acceptable.
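To make the per-day problem concrete, here is a minimal sketch in plain Python (no Cloudant calls). The function names are hypothetical, and it assumes the matching emails are spread evenly across days; the only Cloudant-specific fact it relies on is that document IDs in a partitioned database have the form `<partition key>:<document key>`.

```python
import math
from datetime import date


def day_partition_doc_id(received_on: date, email_id: str) -> str:
    """Build a document ID whose partition key is the day the email arrived.

    Partitioned Cloudant databases use IDs of the form
    '<partition key>:<document key>'.
    """
    return f"{received_on.isoformat()}:{email_id}"


def partition_queries_for_page(matches_per_day: float, page_size: int = 15) -> int:
    """Estimate how many day partitions must be queried to fill one page.

    Assumption: every day holds the same number of matching emails.
    """
    if matches_per_day <= 0:
        raise ValueError("matches_per_day must be positive")
    return math.ceil(page_size / matches_per_day)


print(day_partition_doc_id(date(2024, 1, 31), "abc123"))  # 2024-01-31:abc123
print(partition_queries_for_page(1))    # 15 queries for a sparse category
print(partition_queries_for_page(100))  # 1 query when the day is busy
```

With a busy category one partition query fills the page, but with a sparse category the number of queries grows to the page size.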
Before partitioning was introduced, I was simply doing global queries with the Lucene query engine, but that is no longer an option because of the cost.
I also considered putting all the emails in a single partition, so that I could run the same old query within that partition and, since it is a partition query, pay the partition query price rather than the global query price. That works in theory, but I guess it has its limits, since the documentation on partitions recommends not putting "too much data" in a single partition.
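For reference, the single-partition variant would look roughly like the sketch below. It only builds the request path and the Lucene query string; nothing is sent. The database name `emails`, the partition key `all`, the design document `mail`, the index `byCategoryAndDate`, and the indexed fields `category` and `received` (a date stored as a number such as 20240131) are all hypothetical names I made up for illustration.

```python
from urllib.parse import quote


def partition_search_path(db: str, partition: str, ddoc: str, index: str) -> str:
    """Path of a search index query scoped to one partition."""
    parts = [quote(p, safe="") for p in (db, partition, ddoc, index)]
    return "/{}/_partition/{}/_design/{}/_search/{}".format(*parts)


def build_query(categories: list[str], start: int, end: int) -> str:
    """Lucene query: any of the given categories, within a date range.

    Assumes 'category' is indexed as a string and 'received' as a number.
    """
    cats = " OR ".join(f'"{c}"' for c in categories)
    return f"category:({cats}) AND received:[{start} TO {end}]"


path = partition_search_path("emails", "all", "mail", "byCategoryAndDate")
body = {"q": build_query(["C1", "C2"], 20240101, 20240430), "limit": 15}

print(path)
print(body)
```

The query itself is the same one I used globally; only the path changes, which is why this option is tempting despite the partition size concern.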
Do you have any recommendations for this kind of problem?
Thanks.