
I know this question has been asked before, but the answers I found did not address my issue.

We are using Azure Search both for scenarios where we need to analyze text and for scenarios where only a filter is applied to static data. For example, if you have a product details stub in Azure Search with just category and price, users can find particular categories in particular price ranges without needing any text analysis. For such scenarios, which is faster: DocumentDB or Azure Search?

If faceting is required, does DocumentDB also offer a facility (either faceting or group by) that is as efficient as Azure Search faceting?

Is there any way of accessing the unique terms indexed in Azure Search? For example, if "London, England" is indexed as London and England, can we look up the index and get the unique terms London and England?

In addition, can we attach metadata to terms so that we can, for example, identify that the term London is a city while England is the name of a country?

Many thanks

1 Answer

Generally speaking, if you only need to formulate structured queries, Cosmos DB (the evolution of DocumentDB) is a natural choice. It offers more flexibility around consistency models (which makes things easier for your application if you want, for example, to read your own writes immediately), requires no schema, and supports more flexible queries. That is, it is a general-purpose database rather than a domain-specific one like a search engine.

If you need both structured and search-style queries, then Azure Search can often handle both. You could use Azure Search for both aspects of this workload if your application can tolerate the weaker consistency model of Azure Search and you don't need any query that Cosmos DB can express but Azure Search cannot.
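For the filter-only case from the question (a category plus a price range), the two requests might look roughly like the sketch below. The field names `category` and `price` and the sample values are assumptions taken from the question's example, not a real schema. The Azure Search body uses an OData filter expression, and the Cosmos DB body is a parameterized SQL query; neither is sent anywhere here.

```python
import json


def azure_search_filter_body(category, min_price, max_price):
    """Build a search request body that relies on a filter only (no text analysis)."""
    # Single quotes inside OData string literals are escaped by doubling them.
    safe_category = category.replace("'", "''")
    return {
        "search": "*",  # match every document; only the filter narrows the results
        "filter": (
            f"category eq '{safe_category}' "
            f"and price ge {min_price} and price le {max_price}"
        ),
    }


def cosmos_query_spec(category, min_price, max_price):
    """Build the equivalent parameterized SQL query for Cosmos DB."""
    return {
        "query": (
            "SELECT * FROM c WHERE c.category = @category "
            "AND c.price >= @min AND c.price <= @max"
        ),
        "parameters": [
            {"name": "@category", "value": category},
            {"name": "@min", "value": min_price},
            {"name": "@max", "value": max_price},
        ],
    }


# Hypothetical sample values, for illustration only.
search_body = azure_search_filter_body("Shoes", 10, 50)
cosmos_spec = cosmos_query_spec("Shoes", 10, 50)
print(json.dumps(search_body, indent=2))
print(json.dumps(cosmos_spec, indent=2))
```

Both fields would need to be marked filterable in the Azure Search index for the first request to work.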

If faceting is required, does DocumentDB also offer that facility

Cosmos DB does support aggregates; you can read more about this here: https://azure.microsoft.com/en-us/blog/planet-scale-aggregates-with-azure-documentdb/?v=17.23h
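As a rough illustration, an aggregate query can stand in for a single facet bucket. The sketch below only builds parameterized query specs; the `category` field and the list of category values are assumptions for the example.

```python
def count_for_category_spec(category):
    """Build a parameterized aggregate query that counts documents in one category."""
    return {
        "query": "SELECT VALUE COUNT(1) FROM c WHERE c.category = @category",
        "parameters": [{"name": "@category", "value": category}],
    }


# Hypothetical category values; one query spec per facet bucket.
known_categories = ["Shoes", "Hats"]
count_specs = [count_for_category_spec(c) for c in known_categories]

for spec in count_specs:
    print(spec["query"], spec["parameters"])
```

Unlike a facet, this approach needs the category values up front, so it counts known buckets rather than discovering them.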

Unless you have huge data or query volumes, performance is more a function of how much capacity you provision than something inherent to which service you choose. That said, Cosmos DB offers strict SLAs on read and write latency.

Is there any way of accessing the unique terms indexed in Azure Search

This seems like a different topic; perhaps start another question if you need more specifics. The short version is that there's no way to access the actual terms dictionary. If you need to know which terms exist, you can sometimes accomplish this by storing all the terms you care about (perhaps a subset of the total) in a Collection(String) field in each document and then faceting over that field.
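A minimal sketch of that workaround follows. It assumes a hypothetical facetable Collection(String) field named `terms`, uses a crude regex split as a stand-in for whatever analyzer the index really uses, and parses a hand-written sample response rather than calling the service.

```python
import re


def extract_terms(text):
    """Crude stand-in for an analyzer: lowercase word tokens, de-duplicated."""
    return sorted({token.lower() for token in re.findall(r"\w+", text)})


def to_index_document(doc_id, location):
    """Attach the extracted terms to the document in a Collection(String) field."""
    return {"id": doc_id, "location": location, "terms": extract_terms(location)}


# Request body that asks only for facet buckets over the hypothetical `terms` field.
facet_request = {"search": "*", "facets": ["terms,count:1000"], "top": 0}


def unique_terms(search_response):
    """Read the distinct term values out of the facet section of a response."""
    buckets = search_response.get("@search.facets", {}).get("terms", [])
    return [bucket["value"] for bucket in buckets]


doc = to_index_document("1", "London, England")

# Hand-written sample in the shape of a facet response, for illustration only.
sample_response = {
    "@search.facets": {
        "terms": [
            {"value": "england", "count": 1},
            {"value": "london", "count": 1},
        ]
    },
    "value": [],
}
print(doc["terms"])
print(unique_terms(sample_response))
```

The facet `count` parameter caps how many buckets come back, so a very large vocabulary may not be returned in full.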