I am using Elasticsearch and Lucene with the standard analyzer. I want my index not to return results for "Paleontology" when the query is "Paleo". I do, however, want it to return results for "Paleolithic", which is related to "Paleo". In other words, I want the analyzer to be smarter: it should filter out stems that are unrelated to the keyword while keeping the stems that are related to it. What solutions are available?
1 Answer
Implement your own stemming filter (or extend an existing one). The standard analyzer doesn't apply stemming, so I'm not sure exactly which stemmer you're using. For reference, though, Lucene includes a PorterStemmer.
If this seems too complex, you could put a stop word filter after your stemmer and simply reject the tokens you don't want.
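To make the filter ordering concrete, here is a minimal Python sketch of a "stem, then reject" chain. It is not Lucene or Elasticsearch code: `tokenize`, `toy_stem`, `analyze`, and the hard-coded stems in `TOY_STEMS` are hypothetical stand-ins, and a real stemmer may produce different stems. The point it illustrates is that a stop filter placed after the stemmer sees stemmed tokens, so its word list has to be written in stemmed form.

```python
import re
from typing import Callable, FrozenSet, List

def tokenize(text: str) -> List[str]:
    # Lowercase and split on non-letters, loosely imitating a
    # standard tokenizer followed by a lowercase filter.
    return re.findall(r"[a-z]+", text.lower())

# Hypothetical stemmer output, hard-coded for illustration only.
# Check what your actual stemmer emits before relying on these forms.
TOY_STEMS = {
    "paleontology": "paleontolog",
    "paleolithic": "paleolith",
}

def toy_stem(token: str) -> str:
    # Stand-in for a real stemmer: look the token up, else pass it through.
    return TOY_STEMS.get(token, token)

def analyze(
    text: str,
    stem: Callable[[str], str] = toy_stem,
    rejected_stems: FrozenSet[str] = frozenset(),
) -> List[str]:
    # Order matters: stem first, then drop tokens whose *stemmed*
    # form is in the reject list.
    stemmed = (stem(token) for token in tokenize(text))
    return [token for token in stemmed if token not in rejected_stems]

# The reject list holds stemmed forms, not the original words.
REJECTED = frozenset({"paleontolog"})
tokens = analyze("Paleontology and Paleolithic tools", rejected_stems=REJECTED)
print(tokens)
```

If the stemmer reduced both words to the same stem, this approach could not tell them apart, which is why it matters which stemmer is actually in use.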
StemFilter, or what? Also, are you intending to manually define rules like this? – femtoRgon