0
votes

I'm learning Apache Lucene and I have some questions about index performance:

  1. I'm building an index from the data in a database; the database schema is also the schema of the Lucene Document.
  2. I have two options for answering searches. The first is to search the index and return the values straight from the index. The second is to search the index, take the id (the table's primary key) of each result, and query the database. For the first I need to store the table's values in the index using Field.Store.YES so that I can retrieve them; for the second it is enough to index the data without storing it, using Field.Store.NO, so the index stays smaller.
  3. Will the first technique (searching an index that stores all the values) hurt performance, or will the second (searching the index and then querying the database, without storing the values) hurt it more?
  4. Which is the better approach, or are there other solutions to this problem?

2 Answers

1
votes

It really depends. If you are going to have a huge dataset, it's usually better to keep the index as light as possible and query the database as you described. However, if the dataset is small, you can store the values in the index as well.

0
votes

Search performance in Apache Lucene is affected by the size of the index, so keep it as light as possible. In your case I recommend timing both approaches: take at least 20 measurements of each and calculate the average. The resulting numbers should help you make a decision.
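
A rough sketch of such a timing test, using only the Java standard library. The two `Supplier`s in `main` are hypothetical placeholders, not Lucene calls: replace them with your own "search and read stored fields" and "search for ids, then query the database" code. The warm-up count and the class and method names are assumptions made for illustration.

```java
import java.util.function.Supplier;

class LookupBenchmark {

    // Runs the task `warmup` times untimed (so JIT compilation and caches settle),
    // then `runs` times timed, and returns the average time per run in nanoseconds.
    static double averageNanos(Supplier<?> task, int warmup, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmup; i++) {
            task.get();
        }
        long total = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.get();
            total += System.nanoTime() - start;
        }
        return (double) total / runs;
    }

    public static void main(String[] args) {
        // Hypothetical stand-ins: plug in the real lookups here.
        Supplier<Object> storedFieldLookup = () -> "values read from the index";
        Supplier<Object> idThenDatabaseLookup = () -> "values read from the database by id";

        double storedAvg = averageNanos(storedFieldLookup, 5, 20);
        double dbAvg = averageNanos(idThenDatabaseLookup, 5, 20);

        System.out.printf("stored fields: %.0f ns avg over 20 runs%n", storedAvg);
        System.out.printf("index + db:    %.0f ns avg over 20 runs%n", dbAvg);
    }
}
```

Run both lookups with the same queries and against realistic data volumes, otherwise the averages will not tell you much about production behaviour.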