I am very new to Lucene. I have a text file containing hundreds of records, with two columns per line. The first column is the userid and the second is the url_list (I guess those will be my document fields).

I need to provide a search feature using Lucene that returns the documents containing the entered URL or userid. For that, I need to create one Lucene document per line of my text file.

Please suggest some sample code for this.

I am using Lucene version 3.6.2.

1 Answer

Here is a short but fantastic tutorial on Lucene for beginners:

Lucene in 5 minutes


Steps

1) I assume you are pre-parsing the text file to extract each userid and its corresponding URL list. You have to do this yourself; Lucene won't help. Lucene does tokenize the text within a single field, but it won't split a line and assign the userid to the USER_ID field and the URLs to the URL field.

2) Read the tutorial above. I highly recommend using the latest version of Lucene, which is 4.1 as of now.

3) Things to remember that are specific to your use case:

  • Have two fields for each document: USER_ID, URL (of course you may change those names)

  • Do not ANALYZE (break into tokens) the content of the USER_ID field, so that a userid is indexed and matched as a single term.

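  • I am not sure how you want to index the URL field. You can either leave it un-ANALYZED, so that a URL only matches exactly, or analyze it with a tokenizer that keeps each URL as a single token. Note that since Lucene 3.1 the StandardAnalyzer splits URLs into pieces; UAX29URLEmailTokenizer is the one that recognizes a URL without breaking it up.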

4) You can find sample code for indexing, querying, searching, and retrieving results in the tutorial.
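
To make the steps above concrete, here is a minimal sketch against the Lucene 3.6 API. It is not taken from the tutorial, and it rests on a few assumptions: each line looks like userid, a TAB, then a comma-separated URL list (adjust the parsing to your real format); the class name UserUrlIndex and its method names are made up for illustration; and the index lives in a RAMDirectory only to keep the example self-contained. Both fields are indexed as NOT_ANALYZED, so a search matches only the exact userid or the exact URL.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.lucene.analysis.KeywordAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

// Hypothetical helper class; the name and line format are assumptions.
public class UserUrlIndex {

    // Step 1: pre-parse one line ("<userid>\t<url1>,<url2>,...") into a Document.
    static Document toDocument(String line) {
        String[] columns = line.split("\t", 2);
        Document doc = new Document();
        // NOT_ANALYZED: the whole value is indexed as a single term.
        doc.add(new Field("USER_ID", columns[0].trim(),
                Field.Store.YES, Field.Index.NOT_ANALYZED));
        if (columns.length > 1) {
            // One URL field instance per URL; a field may hold several values.
            for (String url : columns[1].split(",")) {
                if (!url.trim().isEmpty()) {
                    doc.add(new Field("URL", url.trim(),
                            Field.Store.YES, Field.Index.NOT_ANALYZED));
                }
            }
        }
        return doc;
    }

    // Index one document per non-empty line.
    static Directory buildIndex(List<String> lines) throws IOException {
        Directory dir = new RAMDirectory(); // in-memory, for the demo only
        IndexWriterConfig config =
                new IndexWriterConfig(Version.LUCENE_36, new KeywordAnalyzer());
        IndexWriter writer = new IndexWriter(dir, config);
        try {
            for (String line : lines) {
                if (!line.trim().isEmpty()) {
                    writer.addDocument(toDocument(line));
                }
            }
        } finally {
            writer.close();
        }
        return dir;
    }

    // Exact-match lookup on "USER_ID" or "URL"; returns the matching userids.
    static List<String> findUserIds(Directory dir, String field, String value)
            throws IOException {
        IndexReader reader = IndexReader.open(dir);
        IndexSearcher searcher = new IndexSearcher(reader);
        try {
            List<String> userIds = new ArrayList<String>();
            TermQuery query = new TermQuery(new Term(field, value));
            for (ScoreDoc hit : searcher.search(query, 10).scoreDocs) {
                userIds.add(searcher.doc(hit.doc).get("USER_ID"));
            }
            return userIds;
        } finally {
            searcher.close();
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        Directory dir = buildIndex(Arrays.asList(
                "u1\thttp://a.example/x,http://b.example/y",
                "u2\thttp://b.example/y"));
        System.out.println(findUserIds(dir, "URL", "http://b.example/y"));
        System.out.println(findUserIds(dir, "USER_ID", "u1"));
    }
}
```

For a real index you would read the lines from your file and swap RAMDirectory for FSDirectory.open(new File(...)). On Lucene 4.x the same idea applies, but the Field constructors and some package names differ, so treat this as a 3.6-only sketch.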