
We are working on automatic indexing from a Cosmos DB collection. The collection is partitioned so that we can increase RUs without limits.

We want to create one Azure Search index per Cosmos DB partition. However, reading the partition key feed from Cosmos DB returns partition keys (e.g. '0', '1', etc.) and not the actual values on which the collection is partitioned (in our case cultures: 'en-US', 'fr-FR', etc.). This makes it difficult to programmatically create data sources, indexes and indexers on the fly in Azure Search.

We are using the container query to filter the documents to index (not all of them need to be indexed). Is there a way to specify the Cosmos DB partition key directly, or to specify it in the SQL query, other than filtering on the partition key field (in our case '/Culture')?


1 Answer


The feed you are reading from appears to return partition key ranges rather than the partition key values you defined. Under the hood, Cosmos DB maps multiple logical partition key values to a single physical partition key range in order to make the best use of your storage. The Cosmos DB documentation on partitioning describes this in more detail.

Azure Search does not currently have any way to filter on a logical or physical partition other than adding the partition key filter to the query itself. However, you should be able to programmatically create this query for each data source/indexer by using the following query to obtain all of the distinct partition key values from your Cosmos DB collection, instead of using the partition key range feed:

SELECT DISTINCT c.Culture FROM c

and then loop over all of the results to generate the following query for each key value:

SELECT * FROM c WHERE c.Culture = '<partition key value>'
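As a rough illustration, the loop could be sketched in Python along these lines. This is only a sketch under a few assumptions: the culture values have already been fetched with the DISTINCT query, the helper names (build_query, build_data_source, build_data_sources) are made up for this example, and the dictionary mirrors the general shape of an Azure Search data source definition (name, type, credentials, container). The connection string and collection name are placeholders, and nothing here calls either service.

```python
import re

# Assumed shape of a culture key such as "en-US". Anything else is rejected
# rather than escaped, so the value can be inlined in the query text safely.
CULTURE_PATTERN = re.compile(r"^[A-Za-z]{2,3}(-[A-Za-z0-9]{2,8})*$")


def build_query(culture):
    """Return the per-partition container query for one culture value."""
    if not CULTURE_PATTERN.match(culture):
        raise ValueError(f"unexpected culture value: {culture!r}")
    return f"SELECT * FROM c WHERE c.Culture = '{culture}'"


def build_data_source(culture, connection_string, collection):
    """Build one data source definition (hypothetical naming scheme)."""
    return {
        # Lowercased because Azure Search resource names must be lowercase.
        "name": f"cosmos-{culture.lower()}",
        "type": "cosmosdb",
        "credentials": {"connectionString": connection_string},
        "container": {"name": collection, "query": build_query(culture)},
    }


def build_data_sources(cultures, connection_string, collection):
    """Build one definition per distinct culture, in a stable order."""
    return [
        build_data_source(culture, connection_string, collection)
        for culture in sorted(set(cultures))
    ]


if __name__ == "__main__":
    # Placeholder inputs; real values would come from the DISTINCT query.
    for source in build_data_sources(["en-US", "fr-FR"], "<connection string>", "<collection>"):
        print(source["name"], "->", source["container"]["query"])
```

Each resulting definition could then be sent to Azure Search together with a matching index and indexer. Rejecting unexpected key values instead of trying to escape them keeps the generated query text predictable.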