3
votes

I'm building a simple search form for our support ticket database using Lucene.Net, and I want to let users filter the results, for example by ticket status. What's the best way to do this? As far as I can see, my options are:

  1. Include all the terms I want to filter by in my index, and filter using Lucene.Net
  2. Apply the filter after getting the search results from Lucene.Net, by going to the database and discarding any returned results that don't match the filter.

Option 1 will inflate the size of my index more and more for each extra field I want to filter by, and will cause problems when adding new fields to filter by. Option 2, on the other hand, keeps those fields out of the index but makes paging trickier.

Is there an obvious choice here, or are combinations of both acceptable? (And is there a third option that I can't see?)

1 Answer

2
votes

I wouldn't worry about the size of the index :-)

We go for option 1 all the time and never filter data outside of Lucene.Net. If you filter in the database instead, you could end up in a situation where you need to retrieve a LOT of hits from Lucene.Net before you have enough "true" hits left after filtering, and it could also require several round trips to the database.
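To make that cost concrete, here is a rough sketch in plain Python (not Lucene.Net code). Everything in it is made up for illustration: `search` stands in for a call to the index, `status_by_id` stands in for the database, and the data assumes only one ticket in fifty is "open".

```python
def search(ranked_ids, offset, limit):
    """Stand-in for a search call: return one batch of ranked hits."""
    return ranked_ids[offset:offset + limit]


def page_with_post_filter(ranked_ids, status_by_id, wanted_status,
                          page_size, batch_size):
    """Build one page of results by filtering outside the index.

    Returns (page, hits_fetched, db_roundtrips).
    """
    page, offset, roundtrips = [], 0, 0
    while len(page) < page_size:
        batch = search(ranked_ids, offset, batch_size)
        if not batch:
            break  # search results exhausted before the page was full
        offset += len(batch)
        roundtrips += 1  # one database lookup per batch of candidate ids
        page.extend(t for t in batch if status_by_id[t] == wanted_status)
    return page[:page_size], offset, roundtrips


# Hypothetical data: 1,000 ranked tickets, every 50th one is "open".
ranked = list(range(1000))
status = {t: ("open" if t % 50 == 0 else "closed") for t in ranked}

page, fetched, roundtrips = page_with_post_filter(
    ranked, status, "open", page_size=10, batch_size=100)
print(fetched, roundtrips)  # 500 5
```

With these made-up numbers, a single page of 10 open tickets costs 500 fetched hits and 5 database round trips, and the offset needed for page 2 is only known after page 1 has been filtered.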

We currently have around 100 fields on average on our 150K documents, and that is working extremely well.
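For comparison, this is roughly what option 1 looks like as a toy model, again in plain Python rather than the Lucene.Net API. The tiny inverted index and the names `build_index` and `search` are hypothetical; the point is only that the status becomes one more indexed field, and the filter becomes one more required clause in the query.

```python
from collections import defaultdict


def build_index(tickets):
    """Map (field, term) -> set of ticket ids; every field is indexed alike."""
    index = defaultdict(set)
    for ticket_id, fields in tickets.items():
        for field, text in fields.items():
            for term in text.lower().split():
                index[(field, term)].add(ticket_id)
    return index


def search(index, required):
    """Return the ids that match every (field, term) clause."""
    postings = [index.get(clause, set()) for clause in required]
    return sorted(set.intersection(*postings)) if postings else []


# Hypothetical tickets; "status" is just another indexed field.
tickets = {
    1: {"body": "printer offline again", "status": "open"},
    2: {"body": "printer driver update", "status": "closed"},
    3: {"body": "vpn offline", "status": "open"},
}
index = build_index(tickets)
hits = search(index, [("body", "printer"), ("status", "open")])
print(hits)  # [1]
```

Because the filter is resolved inside the index, every hit that comes back is already a "true" hit, so paging is just an offset into the result list.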