Case insensitive search in arrays for CosmosDB / DocumentDB

Question

Lets say I have these documents in my CosmosDB. (DocumentDB API, .NET SDK)

{
    // partition key of the collection
    "userId" : "0000-0000-0000-0000",
    "emailAddresses": [
        "[email protected]", "[email protected]"
    ]
    // some more fields
}

I now need to find out if I have a document for a given email address. However, I need the query to be case insensitive.

There are ways to search case insensitive on a field (they do a full scan however):

How to do a Case Insensitive search on Azure DocumentDb?

select * from json j where LOWER(j.name) = 'timbaktu'
e => e.Id.ToLower() == key.ToLower()

These do not work for arrays. Is there an alternative way? A user defined function looks like it could help.

I am mainly looking for a temporary low-effort solution to support the scenario (I have multiple collections like this). I probably need to switch to a data structure like this at some point:

{
    "userId" : "0000-0000-0000-0000",
    // Option A
    "emailAddresses": [
        {
            "displayName": "[email protected]",
            "normalizedName" : "[email protected]"
        },
        {
            "displayName": "[email protected]",
            "normalizedName" : "[email protected]"
        }
    ],
    // Option B
    "emailAddressesNormalized": {
        "[email protected]", "[email protected]"
    }
}

Unfortunately, my production database already contains documents that would need to be updated to support the new structure. My production collections contain only 100s of these items, so I am even tempted to just get all items and do the comparison in memory on the client.

You have already given yourself 3 usable solutions: If you can change the schema, update those 100 documents. If not, scan those 100s of docs. What else do you expect in answers? — Imre Pühvel
I can't scan for array properties(server-side). Thats what the question is about. — Alex AIT

Imre Pühvel Imre Pühvel · Accepted Answer · 2017-12-04T16:07:54

If performance matters then you should consider one of the normalization solution you have proposed yourself in question. Then you could index the normalized field and get results without doing a full scan.

If for some reason you really don't want to retouch the documents then perhaps the feature you are missing is simple join?

Example query which will do case-insensitive search from within array with a scan:

SELECT c FROM c
join email in c.emailAddresses
where lower(email) = lower('[email protected]')

You can find more examples about joining from Getting started with SQL commands in Cosmos DB.

Note that where-criteria in given example cannot use an index, so consider using it only along another more selective (indexed) criteria.

Case insensitive search in arrays for CosmosDB / DocumentDB

1 Answers