If I define my index with this analyzer (C#):
settings = new
{
    index = new
    {
        number_of_shards = 1,
        number_of_replicas = 1,
        analysis = new
        {
            analyzer = new
            {
                analyzer_standard_with_html_strip = new
                {
                    type = "standard",
                    char_filter = new string[] { "html_strip" },
                    stopwords = "_english_"
                },
What does the type field do? Does it base the analyzer on the standard analyzer? If I leave out the type line entirely, it still seems to work. This example, from https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-htmlstrip-charfilter.html, seems to suggest you don't need it:
In this example, we configure the html_strip character filter to leave <b> tags in place:
PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "keyword",
          "char_filter": ["my_char_filter"]
        }
      },
      "char_filter": {
        "my_char_filter": {
          "type": "html_strip",
          "escaped_tags": ["b"]
        }
      }
    }
  }
}
There, the analyzer has no type specified. Shouldn't it be "custom"?
So, what does the type field do when you're defining an analyzer? What is the difference between
"my_analyzer": {
  "type": "standard",
  "tokenizer": "keyword",
  "char_filter": ["my_char_filter"]
}
and
"my_analyzer": {
  "type": "custom",
  "tokenizer": "keyword",
  "char_filter": ["my_char_filter"]
}
and
"my_analyzer": {
  "tokenizer": "keyword",
  "char_filter": ["my_char_filter"]
}
?
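For anyone who wants to compare the three variants empirically, here is a minimal sketch (Python, standard library only) that builds one index-creation body per variant. It does not answer the question by itself, and it does not talk to a cluster. The index names (`test_standard`, `test_custom`, `test_no_type`), the helper `build_settings`, and the sample text are hypothetical names chosen for illustration; the analyzer and char_filter definitions are copied from the snippets above.

```python
import json

# Shared char_filter definition, copied from the documentation example above.
CHAR_FILTERS = {
    "my_char_filter": {"type": "html_strip", "escaped_tags": ["b"]}
}


def build_settings(analyzer_type=None):
    """Return an index-creation body whose analyzer optionally sets "type"."""
    analyzer = {"tokenizer": "keyword", "char_filter": ["my_char_filter"]}
    if analyzer_type is not None:
        # Put "type" first so the output mirrors the snippets in the question.
        analyzer = {"type": analyzer_type, **analyzer}
    return {
        "settings": {
            "analysis": {
                "analyzer": {"my_analyzer": analyzer},
                "char_filter": CHAR_FILTERS,
            }
        }
    }


# One body per variant: "standard", "custom", and no type at all.
# The index names used as keys are hypothetical.
VARIANTS = {
    "test_standard": build_settings("standard"),
    "test_custom": build_settings("custom"),
    "test_no_type": build_settings(),
}

# Request body for the _analyze API (sample text is an assumption).
ANALYZE_BODY = {"analyzer": "my_analyzer", "text": "<p>Some <b>bold</b> text</p>"}

if __name__ == "__main__":
    for index_name, body in VARIANTS.items():
        print(f"PUT {index_name}")
        print(json.dumps(body, indent=2))
    print("POST <index>/_analyze")
    print(json.dumps(ANALYZE_BODY, indent=2))
```

Each printed body could be sent as `PUT <index_name>`, followed by `POST <index_name>/_analyze` with `ANALYZE_BODY`. Comparing the returned tokens (or any error returned when the index is created) across the three indices would show whether the type line changes which tokenizer and char_filter are actually applied.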