We are trying to use Azure Cognitive Search to enable full-text search for the documents stored in Azure Blob Storage. One of the features that we need is to show the hit highlights for a particular document.

We've noticed that while the search for an exact phrase correctly matches only those documents that contain this exact phrase, the highlights are returned for the individual words in the phrase, instead of the full phrase.

Example

For the phrase search "supply agreement" we get highlights for "supply" and "agreement".

Request:

{
    "search": "\"supply agreement\"",
    "select": "metadata_storage_name,metadata_storage_path,language",
    "searchFields": "merged_content",
    "highlight": "merged_content"
}
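For reference, a minimal sketch of how a request body like the one above could be built in Python. The helper name `build_phrase_search_body` is hypothetical, and the sketch assumes the phrase itself contains no double quotes; sending the body to the service is left out.

```python
import json


def build_phrase_search_body(phrase, field, select):
    """Build a search body that runs a phrase query with highlighting on one field.

    Hypothetical helper for illustration; it only assembles the JSON payload.
    """
    if '"' in phrase:
        # Assumption: embedded quotes are out of scope for this sketch.
        raise ValueError("phrase must not contain double quotes")
    return {
        # Wrapping the text in double quotes makes it a phrase query.
        "search": f'"{phrase}"',
        "select": ",".join(select),
        "searchFields": field,
        "highlight": field,
    }


body = build_phrase_search_body(
    "supply agreement",
    "merged_content",
    ["metadata_storage_name", "metadata_storage_path", "language"],
)
print(json.dumps(body, indent=4))
```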

Response:

{
    "@odata.context": "https://....search.windows.net/indexes('...')/$metadata#docs(*)",
    "value": [
        {
            "@search.score": 0.047654618,
            "@search.highlights": {
                "merged_content": [
                    "Customer has agreed to engage Supplier to <em>supply</em> the Products and Supplier has agreed to accept the engagement on the terms set out in this <em>Agreement</em>.",
                    "<em>Agreement</em>\n1.",
                    "Tax means goods and services, value added or similar consumption based tax applicable to the <em>supply</em> of the Products under this <em>agreement</em>.",
                    ...
                ]
            },
            "metadata_storage_name": "a2b23e30-c1e0-4c52-a659-d8705662d699.docx",
            "metadata_storage_path": "...",
            "language": "en"
        },
        ...
    ]
}

Is this a known issue in the current version of the Azure Cognitive Search API?

1 Answer

Currently there is no way to highlight the whole phrase, but I have good news for you: phrase highlighting is work that we are tracking and plan to release, although I don't have a specific date to announce just yet.

Luis Cabrera - Principal Program Manager - Azure Cognitive Search
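
Until phrase highlighting is available, one possible client-side workaround (not part of the answer above) is to post-process the returned fragments: strip the per-word tags, keep only the fragments that contain the exact phrase, and wrap the whole phrase instead. The sketch below assumes the default `<em>`/`</em>` highlight tags and plain case-insensitive text matching; the function name `rehighlight_phrase` is hypothetical, and the matching does not reproduce whatever analyzer the index uses.

```python
import re

EM_OPEN, EM_CLOSE = "<em>", "</em>"  # assumed default highlight tags


def rehighlight_phrase(fragments, phrase):
    """Return only the fragments containing the phrase, with the phrase wrapped as a whole."""
    words = phrase.split()
    if not words:
        return []
    # Match the words in order, separated by any whitespace, on word boundaries.
    pattern = re.compile(
        r"\b" + r"\s+".join(re.escape(w) for w in words) + r"\b",
        re.IGNORECASE,
    )
    result = []
    for fragment in fragments:
        # Drop the per-word tags produced by the service.
        plain = fragment.replace(EM_OPEN, "").replace(EM_CLOSE, "")
        if pattern.search(plain):
            # Re-wrap each full occurrence of the phrase, keeping its original casing.
            result.append(
                pattern.sub(lambda m: EM_OPEN + m.group(0) + EM_CLOSE, plain)
            )
    return result


fragments = [
    "the <em>Supply</em> <em>Agreement</em> dated",
    "to <em>supply</em> the Products",
]
print(rehighlight_phrase(fragments, "supply agreement"))
```

Fragments that only highlight the individual words, such as the second one here, are dropped. If the index uses stemming or another language analyzer, the literal matching in this sketch may miss occurrences that the service itself treats as phrase matches.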