In my Elasticsearch dataset we have unique IDs whose segments are separated by periods. A sample ID might look like c.123.5432
Using an nGram tokenizer, I'd like to be able to search for: c.123.54
This doesn't return any results. I believe the tokenizer is splitting on the periods. To account for this I added "punctuation" to token_chars, but there was no change in the results. My analyzer/tokenizer is below.
I've also tried "token_chars": [], which per the documentation should keep all characters.
"settings" : {
"index" : {
"analysis" : {
"analyzer" : {
"my_ngram_analyzer" : {
"tokenizer" : "my_ngram_tokenizer"
}
},
"tokenizer" : {
"my_ngram_tokenizer" : {
"type" : "nGram",
"min_gram" : "1",
"max_gram" : "10",
"token_chars": [ "letter", "digit", "whitespace", "punctuation", "symbol" ]
}
}
}
}
},
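A quick way to see exactly which tokens the analyzer emits (and so whether the periods are being dropped) is the _analyze API. A minimal sketch, with my_index standing in for the real index name; on newer Elasticsearch versions the analyzer and text go in a JSON request body instead of query parameters:

GET /my_index/_analyze?analyzer=my_ngram_analyzer&text=c.123.5432

If a token such as c.123.54 appears in the output, the tokenizer itself is behaving as intended.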
Edit (more info): This is the mapping of the relevant field:
"ProjectID":{"type":"string","store":"yes", "copy_to" : "meta_data"},
And this is the field I'm copying it into (which also has the nGram analyzer):
"meta_data" : { "type" : "string", "store":"yes", "index_analyzer": "my_ngram_analyzer"}
This is the query I'm running in Sense to see whether my search worked (note that it searches the "meta_data" field):
GET /_search?pretty=true
{
    "query": {
        "match": {
            "meta_data": "c.123.54"
        }
    }
}
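As an extra sanity check (not something I've run yet), a term query skips query-time analysis entirely, so if the nGram tokens really were indexed, searching for one of them verbatim should return the document even if the match query above does not:

GET /_search?pretty=true
{
    "query": {
        "term": {
            "meta_data": "c.123.54"
        }
    }
}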