Azure Search supports hit highlighting with full-text search, which helps clients locate the matched term in a returned document. The simple index schema below illustrates the issue.

{
"name": "simple-index", 
"fields": [
    {
        "name": "key",
        "type": "Edm.String",
        "key": true
    },
    {
        "name": "simplefield",
        "type": "Edm.String"
    }
],
"scoringProfiles": [
    {
        "name": "boostedprofile",
        "functionAggregation": null,
        "text": {
            "weights": {
                "simplefield": 5
            }
        },
        "functions": []
    }
],
"corsOptions": null,
"suggesters": [],
"analyzers": [],
"tokenizers": [],
"tokenFilters": [],
"charFilters": []
}

A normal search query like the one below works as expected and returns the expected highlights.

search=foobar&highlight=simplefield

Extending that query to a wildcard query also works as expected: the response contains highlights on the terms matching the prefix. So far so good.

search=foo*&highlight=simplefield&querytype=full

However, when I apply a scoring profile on top of the previous query, the results are unexpected: no highlights are returned.

search=foo*&highlight=simplefield&querytype=full&scoringprofile=boostedprofile

How do I make highlights work for wildcard queries when using a scoring profile?

1 Answer

At the time of answering, this is a known limitation in Azure Search: highlighting doesn't work for wildcard queries when they are used with a scoring profile. Internally, Azure Search uses a highlighter, a component that handles the highlighting flow as a separate process that runs after the search.

For a wildcard query, highlighting involves looking up all terms in the index that match the provided prefix and then using them to compose the highlighted text. Scoring profiles affect the way terms are looked up in the index for highlighting, and as a result the response doesn't include any highlights.

Because this limitation is specific to wildcard queries, one workaround is to pre-process the index so that you don't need to issue wildcard/prefix queries. Please take a look at custom analysis (https://docs.microsoft.com/en-us/rest/api/searchservice/custom-analyzers-in-azure-search). You can, for example, use the edgeNGram token filter to store prefixes of words in the index, and then issue a regular term query with the prefix (without the '*' operator).
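As a rough sketch of that workaround, the index from the question could be redefined along the following lines. The names prefix_analyzer and front_edge_ngram are made up for illustration, and the minGram/maxGram values, the standard_v2 tokenizer and the standard.lucene search analyzer are assumptions to adjust for your data. The important part is that the edge n-gram filter is applied only at indexing time (indexAnalyzer), so the query text itself is not split into prefixes. The scoring profile from the question is left out for brevity and would stay unchanged.

```json
{
  "name": "simple-index",
  "fields": [
    {
      "name": "key",
      "type": "Edm.String",
      "key": true
    },
    {
      "name": "simplefield",
      "type": "Edm.String",
      "searchable": true,
      "indexAnalyzer": "prefix_analyzer",
      "searchAnalyzer": "standard.lucene"
    }
  ],
  "analyzers": [
    {
      "name": "prefix_analyzer",
      "@odata.type": "#Microsoft.Azure.Search.CustomAnalyzer",
      "tokenizer": "standard_v2",
      "tokenFilters": ["lowercase", "front_edge_ngram"]
    }
  ],
  "tokenFilters": [
    {
      "name": "front_edge_ngram",
      "@odata.type": "#Microsoft.Azure.Search.EdgeNGramTokenFilterV2",
      "minGram": 2,
      "maxGram": 20,
      "side": "front"
    }
  ]
}
```

With an index like this, the prefix can be sent as an ordinary term, for example search=foo&highlight=simplefield&scoringprofile=boostedprofile, so no wildcard expansion is involved. Keep in mind that changing the analyzer of an existing field generally means re-creating the index and re-indexing the documents.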

I hope this is useful. Please vote on the feedback item below to help us prioritize our development efforts on other highlighting modes that would support this use case: https://feedback.azure.com/forums/263029-azure-search/suggestions/32661961-implement-other-highlighters