
If I have a document with these words in the content:

"dolor de cabeza"

and the field uses the Spanish analyzer, searching for "dolor de cabeza" (with quotes) returns the document correctly, but searching for dolor de cabeza (without quotes) returns nothing.

In fact, any stop word in the search query causes it to return no documents when using queryType=Full and searchMode=All.

The problem with the quoted approach is that it only matches the exact phrase.

Is there any workaround? I think this is a bug.

Do you have different, language-specific analyzers set on the fields you are searching over? (comment by Yahnoosh)

1 Answer


Short version:

This happens when you issue a search query with searchMode=All against fields whose analyzers process stop words differently. Make sure you scope your query only to fields analyzed with the same analyzer, using the searchFields search request parameter. Alternatively, set the same searchAnalyzer on all your searchable fields, so that stop words are removed from your query in the same way for every field. To learn more about custom analyzers and how to set indexAnalyzer and searchAnalyzer independently, see the Azure Search documentation on custom analyzers.
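As a rough illustration of the first workaround, here is a minimal Python sketch that builds a query URL scoped to a single field with searchFields. The service name, index name, and the field name content are placeholders I made up, and the api-version value is an assumption you should adjust to whatever your service uses. It only builds the URL; sending the request (with an api-key header) is left out.

```python
from urllib.parse import urlencode


def build_search_url(service, index, search, search_fields, api_version="2020-06-30"):
    """Build a GET search URL restricted to the given fields.

    service, index and api_version are placeholders/assumptions, not values
    taken from the question.
    """
    params = {
        "search": search,
        "queryType": "full",
        "searchMode": "all",
        # Scope the query to fields that share the same analyzer.
        "searchFields": ",".join(search_fields),
        "api-version": api_version,
    }
    return f"https://{service}.search.windows.net/indexes/{index}/docs?{urlencode(params)}"


# Hypothetical names: "my-service", "my-index" and the field "content".
print(build_search_url("my-service", "my-index", "dolor de cabeza", ["content"]))
```

With the query scoped to fields that all use the Spanish analyzer, the stop word "de" is dropped consistently and no other field can demand a match for it.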

Long version:

Let’s take an index with two fields, where one is analyzed with the English Lucene analyzer and the other with the standard (default) analyzer.

{
  "fields":[
    {
      "name":"docId",
      "type":"Edm.String",
      "key":true,
      "searchable":false
    },
    {
      "name":"field1",
      "type":"Edm.String",
      "analyzer":"en.lucene"
    },
    {
      "name":"field2",
      "type":"Edm.String"
    }
  ]
}

Let’s add these two documents:

{
  "value":[
    {
      "docId":"1",
      "field1":"Waiting for a bus",
      "field2":"Exploring cosmos"
    },
    {
      "docId":"2",
      "field1":"Run to the hills",
      "field2":"run for your life"
    }
  ]
}

The following query doesn’t return any results: search=wait+for&searchMode=all

That's because the terms in this query are processed independently for each field in the index, by the analyzer defined for that field. For field1 the query becomes search=wait (‘for’ is removed because it is a stop word). For field2 it stays search=wait+for (the standard analyzer doesn’t remove stop words).

When you set searchMode=all, you tell the search engine that every query term must be matched at least once. Only the first document matches ‘wait’ (in field1), but ‘for’ can now only be matched in field2, and field2 of the first document doesn’t contain it. Hence no results.

For comparison, another query with a stop word, search=running+for&searchMode=all, returns the second document. The term ‘running’ matches in field1 (it’s stemmed to ‘run’) and ‘for’ matches in field2.
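The behaviour described above can be reproduced with a small toy model in Python. This is only a sketch of the matching logic as explained in this answer, not Azure Search's actual implementation: the stop word list and the stemming table are tiny made-up stand-ins for the en.lucene analyzer, and treating a term that is dropped in every searched field as "no constraint" is my assumption.

```python
# Toy model only: tiny stand-ins for the real analyzers.
STOPWORDS = {"a", "for", "the", "to"}          # illustrative subset
STEMS = {"waiting": "wait", "running": "run"}  # illustrative stemming table


def standard_analyzer(text):
    """Lowercase and split on whitespace; stop words are kept."""
    return text.lower().split()


def english_analyzer(text):
    """Lowercase, drop stop words, and apply the toy stemming table."""
    return [STEMS.get(t, t) for t in text.lower().split() if t not in STOPWORDS]


ANALYZERS = {"field1": english_analyzer, "field2": standard_analyzer}

DOCS = {
    "1": {"field1": "Waiting for a bus", "field2": "Exploring cosmos"},
    "2": {"field1": "Run to the hills", "field2": "run for your life"},
}


def matches(doc, query, fields):
    """searchMode=all: every query term must match in at least one field."""
    indexed = {f: set(ANALYZERS[f](doc[f])) for f in fields}
    for term in query.split():
        # Each term is analyzed separately per field, so a stop word can
        # disappear for one field and survive for another.
        per_field = {f: ANALYZERS[f](term) for f in fields}
        if not any(per_field.values()):
            continue  # dropped in every field: assumed to impose no constraint
        if not any(tok in indexed[f] for f, toks in per_field.items() for tok in toks):
            return False
    return True


def search_all(query, fields=("field1", "field2")):
    return [doc_id for doc_id, doc in DOCS.items() if matches(doc, query, fields)]


print(search_all("wait for"))                     # []
print(search_all("running for"))                  # ['2']
print(search_all("wait for", fields=["field1"]))  # ['1']
```

The last call mirrors the searchFields workaround: once the query is scoped to field1 only, ‘for’ is removed everywhere it is analyzed, so it no longer has to match anything and the first document is returned.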

To learn more about query processing in Azure Search, read "How full text search works in Azure Search".