I am trying to use a Lucene index on a remote server as input for Carrot2, which is installed on the same server. According to the documentation, this should be possible with carrot2-dcs (documentation chapter 3.4, Carrot2 Document Clustering Server: "Various document sources included. Carrot2 Document Clustering Server can fetch and cluster documents from a large number of sources, including major search engines and indexing engines (Lucene, Solr)").

After installing carrot2-dcs 3.9.3, I discovered that Lucene isn't available as a document source. How should I proceed?

1 Answer

To cluster content from a Lucene index, the index needs to be available on the server the DCS is running on (either through the local file system or, e.g., as an NFS mount).

To make the Lucene source visible in the DCS:

  1. Open for editing: war/carrot2-dcs.war/WEB-INF/suites/source-lucene-attributes.xml
  2. Uncomment the configuration sections and provide the location of your Lucene index and the fields that should serve as the documents' titles and content (at least one is required). Remember that the fields must be "stored" in Lucene speak.
  3. Make sure the edited file is packed back into the WAR archive, then run the DCS. You should now see the Lucene document source.
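
Step 3 can be done with any ZIP tool, since a WAR file is an ordinary ZIP archive. As one possible approach, here is a minimal Python sketch that swaps a single entry inside the archive. The helper name `replace_war_entry` and the location of the edited copy are assumptions made for illustration; only the WAR path and the entry name come from the steps above.

```python
import os
import tempfile
import zipfile


def replace_war_entry(war_path, entry_name, new_content):
    """Rewrite the WAR so that entry_name holds new_content (bytes).

    Hypothetical helper: zipfile cannot overwrite an entry in place,
    so every entry is copied into a temporary archive that then
    replaces the original file.
    """
    target_dir = os.path.dirname(os.path.abspath(war_path))
    fd, tmp_path = tempfile.mkstemp(suffix=".war", dir=target_dir)
    os.close(fd)
    replaced = False
    try:
        with zipfile.ZipFile(war_path, "r") as src, \
                zipfile.ZipFile(tmp_path, "w", zipfile.ZIP_DEFLATED) as dst:
            for item in src.infolist():
                if item.filename == entry_name:
                    # Write the edited file under the original entry name.
                    dst.writestr(item, new_content)
                    replaced = True
                else:
                    # Copy all other entries unchanged.
                    dst.writestr(item, src.read(item.filename))
        if not replaced:
            raise KeyError(entry_name + " not found in " + war_path)
        # Swap the rebuilt archive in only after it was written completely.
        os.replace(tmp_path, war_path)
    finally:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)


# Example call (assumes the edited XML was saved next to the script):
# with open("source-lucene-attributes.xml", "rb") as f:
#     replace_war_entry(
#         "war/carrot2-dcs.war",
#         "WEB-INF/suites/source-lucene-attributes.xml",
#         f.read(),
#     )
```

Keep a backup copy of the original WAR before running something like this, and stop the DCS first so the archive is not in use while it is being rewritten.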