I'm a bit stumped about how to add facets to an already existing Lucene index.
I have a Lucene index that was built with Lucene 3.1, without any facets.
I've looked over the Lucene documentation for facets, and it shows how to build an index with facets from scratch: you create a new Lucene Document object, use the taxonomy tools to add facet information (categories) to it, and then write that document to the Lucene index using IndexWriter. This also adds extra data to the taxonomy index (via TaxonomyWriter), as described here:
However, what I want is to take the data already stored in the existing Lucene index and create from it a new Lucene index (with a taxonomy index alongside it) that contains exactly the same data as the original index, plus the various category information.
More precisely, my question is:
Is it enough to read a document from the original index, create its CategoryPath, and then write it to the new index, like this:
//get a document from original Lucene index:
Query query = queryParser.parse("*:*");
TopDocs originalTopDocs = originalIndexSearcher.search(query, 100);
Document originalDocument = originalIndexSearcher.doc(originalTopDocs.scoreDocs[1].doc);
//create categories for original document
CategoryDocumentBuilder categoryDocBuilder = new CategoryDocumentBuilder(taxonomyWriter);
categoryDocBuilder.setCategoryPaths(categoriesPaths);
//create new document from original document + categories:
Document originalDocumentWithCategories = categoryDocBuilder.build(originalDocument);
//write new document to new index:
newIndexWriter.addDocument(originalDocumentWithCategories);
Does the above code index the same document as it was stored in the original index, but with the category data added? For example, will the data for the non-stored fields of the original document still be present in the newly created and indexed document?
Also, is there a better way to do this update, perhaps one that does not require creating a new index?