I currently have a terms facet that excludes certain terms from its results using the exclude option. I tried the same with a terms_stats facet, but the excluded terms are not dropped.

I have looked at the Elasticsearch documentation (http://www.elasticsearch.org/guide/reference/api/search/facets/terms-stats-facet/) and see that the terms_stats facet doesn't appear to have an exclude option. Since I don't always trust my own interpretation of the Elasticsearch docs, I wanted to see whether anybody had found a workaround (beyond filtering out the results client-side).

This facet doesn't work as expected:

"keywords_bad":{
  "terms_stats":{
    "size":100,
    "value_field":"retweet_count",
    "exclude":["http","consected"],
    "order":"total",
    "key_field":"text"
  }
}

whereas this facet works as expected:

"keywords_good":{    
  "terms":{
    "size":100,
    "exclude":["http","consected"],
    "order":"count",
    "field":"text"
  }
}

Any reasonable suggestions would be appreciated, as this seems a little inconsistent.

UPDATE

Based on the accepted answer by imotov, I created an issue on GitHub at https://github.com/elasticsearch/elasticsearch/issues/2916

How many different terms can a record have? Can a single record have both included and excluded terms at the same time? - imotov
The terms facet effectively counts the number of documents each term appears in, so you can see which terms appear in more documents than others. If, for example, you are running a terms facet on Tweets, you'd see http and your screen name appear loads. 'exclude' allows those terms to be filtered out without excluding the whole document from contributing other terms (as a query / filter would). 'terms_stats' should do the same, but rather than counting 1 for each document, it should total the value in the value_field. So I'd like to exclude unnecessary terms in the same way. - Phil
Or, to answer your question, imotov: all terms are included unless explicitly excluded (for the terms facet). And there appears to be no way to exclude terms in the terms_stats facet, making a whole load of junk eligible to be counted. - Phil
I understand. I was just trying to figure out if facet_filter could be used as a workaround. Apparently not. terms_stats doesn't support exclude at the moment. I would suggest creating an issue on GitHub asking for it to be added. - imotov
imotov - that sounds like the answer. Feel free to post it as such. - Phil

1 Answer

The terms_stats facet doesn't support exclude at the moment. I would suggest creating an issue on GitHub asking for it to be added.
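
In the meantime, the client-side filtering mentioned in the question is probably the only option. Below is a minimal sketch in Python. It assumes the search response has already been parsed into a dict and that the facet's entries sit under facets.<name>.terms, each with a "term" key. The helper name drop_excluded_terms and the sample data are hypothetical, made up for illustration only.

```python
def drop_excluded_terms(response, facet_name, excluded, limit=None):
    """Return the facet's entries whose term is not in `excluded`.

    Assumes the parsed response keeps the entries under
    response["facets"][facet_name]["terms"], each with a "term" key.
    """
    excluded = set(excluded)
    entries = response["facets"][facet_name]["terms"]
    kept = [entry for entry in entries if entry["term"] not in excluded]
    # Optionally trim back to the number of entries actually wanted.
    return kept if limit is None else kept[:limit]


# Hypothetical, hand-written response fragment (not real output).
sample_response = {
    "facets": {
        "keywords_bad": {
            "terms": [
                {"term": "http", "count": 40, "total": 900.0},
                {"term": "elasticsearch", "count": 12, "total": 310.0},
                {"term": "consected", "count": 9, "total": 120.0},
                {"term": "facet", "count": 5, "total": 75.0},
            ]
        }
    }
}

filtered = drop_excluded_terms(
    sample_response, "keywords_bad", ["http", "consected"], limit=100
)
```

Because the unwanted terms still take up slots in the facet's result, it may help to request a larger "size" than is actually needed (for example, the wanted size plus the number of excluded terms) and trim after filtering.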