0
votes

I am implementing faceted search using Lucene. I have a document index and a taxonomy index, and I collect facets for a given level of the taxonomy.

My question is: How can I get the number of documents indexed in a given Category of the Taxonomy?

I think my question is quite simple, but I couldn't find a suitable method in Lucene's API or by searching Google. I only found how to get the number of documents in the whole index, using the numDocs() method of the IndexReader class.


2 Answers

1
votes

If you have one term for each category in the index, perhaps you can use something like TermEnum.docFreq()? You can get the TermEnum object from IndexReader.terms(Term).
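To make the idea concrete, here is a minimal sketch of why a per-term document frequency doubles as a category count. It is a toy in-memory stand-in, not Lucene code: ToyCategoryIndex and its methods are hypothetical names, and it assumes every document is indexed with exactly one term per category it belongs to.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy stand-in for an inverted index; this is NOT the Lucene API.
final class ToyCategoryIndex {
    // category term -> ids of the documents that contain that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Index one document under each of its category terms.
    void add(int docId, String... categories) {
        for (String category : categories) {
            postings.computeIfAbsent(category, k -> new HashSet<>()).add(docId);
        }
    }

    // Document frequency: how many documents contain the term.
    int docFreq(String category) {
        Set<Integer> docs = postings.get(category);
        return docs == null ? 0 : docs.size();
    }

    public static void main(String[] args) {
        ToyCategoryIndex index = new ToyCategoryIndex();
        index.add(1, "books", "books/fiction");
        index.add(2, "books");
        index.add(3, "music");
        System.out.println(index.docFreq("books"));         // 2
        System.out.println(index.docFreq("books/fiction")); // 1
        System.out.println(index.docFreq("films"));         // 0
    }
}
```

In Lucene 3.x, IndexReader.docFreq(Term) should give you the same number without walking a TermEnum. One caveat, as far as I know: Lucene's document frequencies do not account for deleted documents, so the count can be slightly high until the index is optimized or merged.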

0
votes

I don't really know enough about your index structure to suggest the correct query for you, but if you run a query that matches all the documents in your category, the returned results will include the total number of hits for that query.

For instance, if you query using either of:

search(Query query, int n)
search(Query query, Filter filter, int n) 

Then you will get a TopDocs object back, and TopDocs.totalHits gives you the total number of hits.
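A point that is easy to miss: the hit count covers every matching document, not just the n that are returned, so a small n is enough when you only want the count. The sketch below models that behaviour in plain Java. It is not the Lucene API: ToySearch and ToyTopDocs are hypothetical names that only mirror the shape of TopDocs, and matching on a single category string is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a top-n search; this is NOT the Lucene API.
final class ToySearch {
    // Mirrors the shape of a top-n result: a total count plus at most n ids.
    static final class ToyTopDocs {
        final int totalHits;        // number of all matching documents
        final List<Integer> topIds; // ids of at most n matching documents

        ToyTopDocs(int totalHits, List<Integer> topIds) {
            this.totalHits = totalHits;
            this.topIds = topIds;
        }
    }

    // docCategories[i] is the category of document i.
    static ToyTopDocs search(String[] docCategories, String category, int n) {
        List<Integer> top = new ArrayList<>();
        int total = 0;
        for (int id = 0; id < docCategories.length; id++) {
            if (docCategories[id].equals(category)) {
                total++;                 // every match is counted
                if (top.size() < n) {
                    top.add(id);         // only the first n matches are kept
                }
            }
        }
        return new ToyTopDocs(total, top);
    }

    public static void main(String[] args) {
        String[] docs = {"books", "music", "books", "books"};
        ToyTopDocs result = search(docs, "books", 1);
        System.out.println(result.totalHits);     // 3
        System.out.println(result.topIds.size()); // 1
    }
}
```

With the Lucene 3.x signatures above, the equivalent would be to pass a query for the category term with n set to 1 and read totalHits from the result.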