You have an analyzer configuration issue.
Let me explain. When you defined your index in Elasticsearch, you didn't specify an analyzer for the field, so the Standard Analyzer is applied by default.
According to the documentation:
Standard Analyzer
The standard analyzer is the default analyzer which is used if none is
specified. It provides grammar based tokenization (based on the
Unicode Text Segmentation algorithm, as specified in Unicode Standard
Annex #29) and works well for most languages.
Also, to answer your question:
Why? Is minus a special character what I understand? It symbolizes
"exclude"?
For the Standard Analyzer, yes, it is special. It doesn't mean "exclude", but it is a punctuation character that is dropped during analysis: the text is split at the hyphen, and the hyphen itself is not indexed.
From the documentation:
Why doesn’t the term query match my document?
[...] There are many ways to analyze text: the default standard
analyzer drops most punctuation, breaks up text into individual words,
and lower cases them. For instance, the standard analyzer would turn
the string “Quick Brown Fox!” into the terms [quick, brown, fox].
[...]
Example:
If you have the following text:
"The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
Then the Standard Analyzer will produce:
[ the, 2, quick, brown, foxes, jumped, over, the, lazy, dog's, bone ]
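To check how a particular value is tokenized on your own cluster, you can send it to the _analyze API. The body below is a minimal sketch, assuming it is POSTed to the /_analyze endpoint; the sample text "Mark-Whalberg" is the value used in the comments on this answer.

```json
{
  "analyzer": "standard",
  "text": "Mark-Whalberg"
}
```

The response should list two tokens, mark and whalberg, with the hyphen gone and the text lowercased. That is why a term query for the exact string "Mark-Whalberg" finds nothing in an analyzed field.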
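If you want your search to match the original value, you have 2 solutions:
- You can use a match query, which analyzes the search text with the same analyzer as the field, so the query terms line up with the indexed terms.
- You can ask Elasticsearch not to analyze the field when you create your index, by declaring it as a non-analyzed field in the mapping (the keyword type in Elasticsearch 5 and later).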
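The two options could look roughly like this. Both bodies are sketches: field_name is a placeholder for your own field, and the mapping syntax assumes Elasticsearch 7 or later (versions 5 and 6 wrap "properties" in a type name, and versions before 5 use a string field with "index": "not_analyzed" instead of the keyword type).

A match query body, sent to the _search endpoint of the index:

```json
{
  "query": {
    "match": {
      "field_name": "Mark-Whalberg"
    }
  }
}
```

An index creation body that stores the field as a single, non-analyzed term:

```json
{
  "mappings": {
    "properties": {
      "field_name": {
        "type": "keyword"
      }
    }
  }
}
```

Keep in mind that a match query matches documents containing any of the analyzed terms by default, so it is looser than an exact lookup. With the keyword mapping, the value is indexed exactly as written, so a term query for "Mark-Whalberg" matches only that exact string, including its case.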
I hope this will help you.
term query on the .keyword subfield: "term": {"field_name.keyword": "Mark-Whalberg"} - Andrei Stefan
.keyword subfield :-) to keep the dash sign and the uppercase-lowercase text. - Andrei Stefan
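Wrapped in a full search request body, the term query from the comment above might look like the sketch below. It assumes field_name has a .keyword subfield, which is what the default dynamic mapping creates for string fields in Elasticsearch 5 and later; if your mapping was defined by hand, the subfield may not exist.

```json
{
  "query": {
    "term": {
      "field_name.keyword": "Mark-Whalberg"
    }
  }
}
```

Because the .keyword subfield is not analyzed, the hyphen and the capital letters are preserved in the index, and the term query compares against the original string.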