I can't make the following work in Atlas using a $search pipeline.
Problem
- If we search with query = "John", only documents containing "John" are returned.
- If we search with "John Doe", far too many documents are returned: every document containing "John" OR "Doe".
We need to be able to search the field 'index' with a query like 'John Doe' and get only the documents containing 'John Doe' in the entity field 'index'.
I have a lot of Mongo entities with the following model (with many names other than John Doe); here is one of these entities:
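To make the expected behaviour concrete, here is a minimal sketch in plain JavaScript. It is not Atlas code: `tokenize`, `matchesAnyTerm` and `matchesPhrase` are hypothetical helpers written only for this illustration, and the tokenizer is a simplifying assumption, not what the Lucene analyzers actually do. The strings are the 'index' values from the sample entity in this question.

```javascript
// Plain JavaScript illustration only; this is not Atlas Search code.
const indexValues = [
  "John Doe 10001 New York",
  "Jane Doe 10001 New York",
];

// Naive tokenizer (assumption): lowercase, split on anything that is not a letter or digit.
const tokenize = (s) => s.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);

// What we observe today: a value matches if it contains ANY query term.
function matchesAnyTerm(value, query) {
  const tokens = new Set(tokenize(value));
  return tokenize(query).some((t) => tokens.has(t));
}

// What we want: the query terms must appear adjacent and in order.
function matchesPhrase(value, query) {
  const tokens = tokenize(value);
  const terms = tokenize(query);
  if (terms.length === 0) return false;
  for (let i = 0; i + terms.length <= tokens.length; i++) {
    if (terms.every((t, j) => tokens[i + j] === t)) return true;
  }
  return false;
}

const anyTermHits = indexValues.filter((v) => matchesAnyTerm(v, "John Doe"));
const phraseHits = indexValues.filter((v) => matchesPhrase(v, "John Doe"));
console.log(anyTermHits.length, phraseHits.length); // prints "2 1"
```

With "John Doe" as the query, the any-term check also accepts the "Jane Doe" value, which is the behaviour we currently see; the phrase check accepts only the "John Doe" value, which is what we are after.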
{
  "_id" : "1b85cbe3-d0f4-44ee-a9fd-f9b81152891d",
  "aList" : [
    {
      "index" : [
        "John Doe 10001 New York",
        "Jane Doe 10001 New York"
      ]
    }
  ],
  "anotherList" : [
    {
      "index" : [
        "John Doe 10001 New York",
        "John Doe 10001 New York"
      ]
    }
  ]
}
I have created a Lucene index in Atlas with the following JSON:
{
  "mappings": {
    "dynamic": false,
    "fields": {
      "anotherList": {
        "fields": {
          "index": [
            {
              "dynamic": true,
              "type": "document"
            },
            {
              "multi": {
                "frenchAnalyzer": {
                  "analyzer": "lucene.french",
                  "searchAnalyzer": "lucene.french",
                  "type": "string"
                },
                "germanAnalyzer": {
                  "analyzer": "lucene.german",
                  "searchAnalyzer": "lucene.german",
                  "type": "string"
                },
                "italianAnalyzer": {
                  "analyzer": "lucene.italian",
                  "searchAnalyzer": "lucene.italian",
                  "type": "string"
                }
              },
              "type": "string"
            }
          ]
        },
        "type": "document"
      },
      "aList": {
        "fields": {
          "index": [
            {
              "dynamic": true,
              "type": "document"
            },
            {
              "multi": {
                "frenchAnalyzer": {
                  "analyzer": "lucene.french",
                  "searchAnalyzer": "lucene.french",
                  "type": "string"
                },
                "germanAnalyzer": {
                  "analyzer": "lucene.german",
                  "searchAnalyzer": "lucene.german",
                  "type": "string"
                },
                "italianAnalyzer": {
                  "analyzer": "lucene.italian",
                  "searchAnalyzer": "lucene.italian",
                  "type": "string"
                }
              },
              "type": "string"
            }
          ]
        },
        "type": "document"
      }
    }
  }
}
Now when I run an aggregation searching for John with the following $search stage, I get
{
  "index": "IndexKundensuche",
  "text": {
    "query": "John",
    "path": [
      {
        "value": "aList.index",
        "multi": "frenchAnalyzer"
      },
      {
        "value": "aList.index",
        "multi": "germanAnalyzer"
      },
      {
        "value": "aList.index",
        "multi": "italianAnalyzer"
      },
      {
        "value": "anotherList.index",
        "multi": "frenchAnalyzer"
      },
      {
        "value": "anotherList.index",
        "multi": "germanAnalyzer"
      },
      {
        "value": "anotherList.index",
        "multi": "italianAnalyzer"
      }
    ]
  }
}
only documents containing "John" back. If we search with query = "John Doe", far too many documents are returned: the returned documents are those containing "John" OR "Doe", and they are not ordered.
Sorting is already in place, but we do not know how the score is calculated. For example, if we search with a postal code, the score is rated higher in some cases, but the first document is not the one we expect.
If a name has a middle part, for example "Jon-Ben Doe", and we search for "Jon Doe", other results come up with Doe before Jon.
What am I doing wrong? Is this supported by Atlas $search? Is my $search query wrong? Are we forced to split the $search pipeline (search for John, then Doe, then ...)?
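To make the last two questions concrete, here is a rough, untested sketch of the two variants I can think of, written as plain JavaScript that only builds the $search stage objects. It assumes the Atlas Search `phrase` operator (with its `slop` option) and the `compound` operator with `must` clauses accept the same `{ value, multi }` paths as my `text` query; `buildPhraseSearch` and `buildAllTermsSearch` are hypothetical helper names, and I do not know whether either variant produces the scoring we expect.

```javascript
// Sketch only: builds $search stages as plain objects; nothing is sent to Atlas here.
const INDEX_NAME = "IndexKundensuche"; // index name from the query above
const FIELDS = ["aList.index", "anotherList.index"];
const MULTIS = ["frenchAnalyzer", "germanAnalyzer", "italianAnalyzer"];

// Same six { value, multi } paths as in the query above.
const paths = FIELDS.flatMap((value) => MULTIS.map((multi) => ({ value, multi })));

// Variant 1 (assumption): a single 'phrase' clause.
// 'slop' is the allowed distance between the query words; 0 means adjacent.
function buildPhraseSearch(query, slop = 0) {
  return { $search: { index: INDEX_NAME, phrase: { query, path: paths, slop } } };
}

// Variant 2 (assumption): split the query on whitespace and require
// every term through compound.must, one 'text' clause per term.
function buildAllTermsSearch(query) {
  const terms = query.trim().split(/\s+/).filter(Boolean);
  return {
    $search: {
      index: INDEX_NAME,
      compound: { must: terms.map((term) => ({ text: { query: term, path: paths } })) },
    },
  };
}

console.log(JSON.stringify(buildAllTermsSearch("John Doe"), null, 2));
```

One concern with the second variant: since 'index' is an array of strings, requiring both terms per document might still match an entity where "John" and "Doe" occur in different array elements. For the "Jon-Ben Doe" case, something like `buildPhraseSearch("Jon Doe", 1)` is what I would try first, assuming the analyzer splits "Jon-Ben" into two tokens.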
Thanks