My Azure Cognitive Search index has several fields; here are the two I am having problems with.

{
   "name": "Name",
   "type": "Edm.String",
   "searchable": true,
   "filterable": false,
   "retrievable": true,
   "sortable": true,
   "facetable": false,
   "key": false,
   "indexAnalyzer": null,
   "searchAnalyzer": null,
   "analyzer": "standard.lucene",
   "synonymMaps": []
}

and

{
    "name": "Code",
    "type": "Edm.String",
    "searchable": true,
    "filterable": false,
    "retrievable": true,
    "sortable": false,
    "facetable": false,
    "key": false,
    "indexAnalyzer": null,
    "searchAnalyzer": null,
    "analyzer": "keyword",
    "synonymMaps": []
}

As you can see above, the Name field uses the standard.lucene analyzer (other fields, such as NameEn, use language-specific analyzers), and the Code field uses the keyword analyzer.

For example, when I search for 1-1, it looks for 1 instead of 1-1. I tried wrapping the term in double quotes ("1-1"), but that doesn't seem to work either.

The issue is that the results contain documents whose Name includes the number 1, instead of documents whose Code is 1-1.

Do you have any idea how I can do this? I suppose I should search for the whole phrase, like: "1-1" rest of the query.

Standard Lucene doesn't keep special characters when tokenizing. Have you tried the en.microsoft analyzer? I just tested it, and it does create a single token for 1-1. - Jdresc
@Joseandresc I do not always have English content there, which is why I used standard Lucene. I could not find a general-purpose analyzer from Microsoft. - mskuratowski

1 Answer

When you send a query, it is analyzed separately by the analyzer of each searchable field, and the resulting tokenized query (which can differ from field to field) is then executed against that field.
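
To make the per-field analysis concrete, here is a toy sketch in Python. It is only a rough approximation and not the actual Lucene algorithms: `standard_like` and `keyword_like` are hypothetical stand-ins that split on non-word characters and keep the input whole, respectively.

```python
import re

def standard_like(text):
    """Rough stand-in for standard.lucene: lowercase and split on non-word characters."""
    return [m.group().lower() for m in re.finditer(r"\w+", text)]

def keyword_like(text):
    """Stand-in for the keyword analyzer: the whole input becomes one token."""
    return [text]

# Hypothetical field -> analyzer mapping, mirroring the two fields in the question.
FIELD_ANALYZERS = {"Name": standard_like, "Code": keyword_like}

def per_field_terms(query):
    """Analyze the same query text once per field, as the search service does."""
    return {field: analyze(query) for field, analyze in FIELD_ANALYZERS.items()}

print(per_field_terms("1-1"))
# {'Name': ['1', '1'], 'Code': ['1-1']}
```

The same query text therefore turns into different terms depending on which field it is matched against.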

You can POST text to the analyze endpoint to debug how each analyzer tokenizes your query: https://serviceName.search.windows.net/indexes/index-name/analyze?api-version=2020-06-30

In your case:

{
    "text": "1-1",
    "analyzer": "standard.lucene"
}

returns these tokens for the Name field

 "tokens": [
        {
            "token": "1",
            "startOffset": 0,
            "endOffset": 1,
            "position": 0
        },
        {
            "token": "1",
            "startOffset": 2,
            "endOffset": 3,
            "position": 1
        }
    ]

and for the Code field

{
    "text": "1-1",
    "analyzer": "keyword"
}

you get

"tokens": [
        {
            "token": "1-1",
            "startOffset": 0,
            "endOffset": 3,
            "position": 0
        }
    ]
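
If you want to script these analyze calls, a minimal sketch using only the Python standard library could look like the following. The service name, index name and API key are placeholders, and the helper names `build_analyze_request` and `analyze` are made up for this example.

```python
import json
import urllib.request

def build_analyze_request(service, index, api_key, text, analyzer):
    """Build (but do not send) a POST request for the analyze endpoint."""
    url = (f"https://{service}.search.windows.net/indexes/{index}"
           "/analyze?api-version=2020-06-30")
    body = json.dumps({"text": text, "analyzer": analyzer}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "api-key": api_key},
    )

def analyze(service, index, api_key, text, analyzer):
    """Send the request and return only the token strings from the response."""
    req = build_analyze_request(service, index, api_key, text, analyzer)
    with urllib.request.urlopen(req) as resp:
        return [t["token"] for t in json.load(resp)["tokens"]]
```

Calling `analyze(...)` once with the analyzer name configured on Name and once with `"keyword"` lets you compare the two token lists side by side.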

So with this query you are really searching for documents where

Name=1 | Code=1-1

If you want to search only in selected fields, you can specify them with the searchFields parameter.
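
As a sketch of what that might look like, the snippet below builds a search request restricted to the Code field. The service name, index name and API key are placeholders, and `build_search_request` is a hypothetical helper; only the `search` and `searchFields` body properties come from the search API itself.

```python
import json
import urllib.request

def build_search_request(service, index, api_key, search_text, search_fields):
    """Build (but do not send) a POST search request limited to the given fields."""
    url = (f"https://{service}.search.windows.net/indexes/{index}"
           "/docs/search?api-version=2020-06-30")
    body = {
        "search": search_text,
        # searchFields is sent as a comma-separated list of field names.
        "searchFields": ",".join(search_fields),
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json", "api-key": api_key},
    )

req = build_search_request("serviceName", "index-name", "<api-key>", "1-1", ["Code"])
print(req.data.decode("utf-8"))
# {"search": "1-1", "searchFields": "Code"}
```

With the search limited to Code, the query text should only be run through the keyword analyzer, so 1-1 stays a single term and the Name=1 matches drop out.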