I'm trying to get "significant terms" for a subset of documents in Solr. This may or may not be the best way, but I'm currently attempting to use Solr's TF-IDF functionality since we have the data stored in Solr and it's lightning fast. I want to restrict the "DF" count to a subset of my documents, through a search or a filter. I tried this, where I'm searching for "apple" in the name field:
and that of course, only gives me documents that have "apple" in the name, but my document frequency gives the counts from the entire dataset, which doesn't seem like what I want. I would think Solr can do this, but maybe not. I'm open to suggestions.
Thanks, Adrian