0
votes

I am using Lucene to search a contacts database. By contacts I mean a name plus multiple phone numbers, emails, addresses, etc. In the database these are obviously split into separate fields and even separate tables. I want to be able to search for contacts based on any of the fields, so for instance I could type "John Doe" and Lucene would return John Doe's contact info. John Doe also has a phone number, and I'd like to be able to find his record by entering that phone number, or address, or email, etc. I don't want to have to specifically state which field I am searching.

When creating my index, is it best to merge all the data into a single "data" field, or keep them separate? I will not be storing data in the index except for an id which I will use to retrieve all additional data from the database. Will the Standard Analyzer and Query Parser work well in my situation or should I take more of a custom approach?

I am fairly new to Lucene and am just learning how powerful it really is, so I'm not against really getting into it and creating some complicated custom search queries, but I will need some direction on doing so, and I want to avoid all of that if it's not necessary.


2 Answers

1
votes

Using a single search field is the most efficient solution: it makes your index smaller and faster to search. Even if you stored fields, you could still have a single aggregated field that is indexed (but not stored) for search, plus one stored (but not indexed) field for each piece of contact information.
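
As a rough illustration of the aggregation step, here is a minimal plain-Java sketch (no Lucene dependency) that joins all of a contact's values into one searchable string and keeps the id alongside it. The `IndexEntry` class and its member names are hypothetical, made up for this example. In Lucene terms, `id` would presumably go into a stored field and `searchText` into an indexed-but-not-stored text field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical holder for what one contact contributes to the index. */
final class IndexEntry {
    final String id;          // stored; used to load the full record from the database
    final String searchText;  // indexed only; every searchable value joined together

    IndexEntry(String id, String searchText) {
        this.id = id;
        this.searchText = searchText;
    }

    /** Joins every non-blank value from every group into one space-separated string. */
    @SafeVarargs
    static IndexEntry of(String id, List<String>... valueGroups) {
        List<String> parts = new ArrayList<>();
        for (List<String> group : valueGroups) {
            for (String value : group) {
                if (value != null && !value.trim().isEmpty()) {
                    parts.add(value.trim());
                }
            }
        }
        return new IndexEntry(id, String.join(" ", parts));
    }

    public static void main(String[] args) {
        IndexEntry entry = IndexEntry.of("42",
                Arrays.asList("John Doe"),                       // name
                Arrays.asList("05 32 11 22 33", "0612345678"),   // phone numbers
                Arrays.asList("john.doe@example.com"),           // emails
                Arrays.asList("1 Main Street, Springfield"));    // addresses
        System.out.println(entry.id + " -> " + entry.searchText);
    }
}
```

The analyzer then only ever sees one field per contact, which is what keeps the query side simple: the user's input is parsed against that single field.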

The standard analyzer and query parser will help you quickly build a prototype, but you may need a custom analyzer to improve your application, for example if you want:

  • queries to give the same results independently of the diacritical marks (ASCIIFoldingFilter),
  • to handle spaces in phone numbers (so that a query for "0532" matches "0532" as well as "05 32").
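
To make the two points above concrete, here is a minimal plain-Java sketch of the kind of normalization involved. It is not Lucene code: `ContactNormalizer` and its methods are hypothetical names, and `java.text.Normalizer` is used only as a rough stand-in for `ASCIIFoldingFilter`, which covers more characters than this does. The same normalization would have to be applied both when indexing and when querying.

```java
import java.text.Normalizer;

/** Hypothetical helper; an approximation of what a custom analyzer would do. */
final class ContactNormalizer {

    /** Decomposes accented letters (NFD) and drops the combining marks. */
    static String foldDiacritics(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    /** Keeps digits only (plus a leading '+'), so "05 32" becomes "0532". */
    static String normalizePhone(String raw) {
        String trimmed = raw.trim();
        String digits = trimmed.replaceAll("[^0-9]", "");
        return trimmed.startsWith("+") ? "+" + digits : digits;
    }

    public static void main(String[] args) {
        // Unicode escapes avoid depending on the source file encoding.
        System.out.println(foldDiacritics("Jos\u00e9 M\u00fcller"));
        System.out.println(normalizePhone("05 32"));
    }
}
```

With both sides normalized this way, a query for "0532" and an indexed value of "05 32" end up as the same token.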
1
votes

You need not create a single combined field; keeping them separate is probably a better design decision. Think down the line: you might want to do specialized searching.

You can use a MultiFieldQueryParser to search all the fields, such as Name, Address, City...
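
To show what searching all fields amounts to, here is a minimal plain-Java sketch that expands a user query into per-field clauses in Lucene's classic query syntax. As I understand it, this is roughly the shape of query that Lucene's `MultiFieldQueryParser` produces, but the code below is only an illustration: `MultiFieldExpander` is a hypothetical name, it does not call Lucene, and it does not escape special query characters.

```java
/** Hypothetical illustration of expanding each term across several fields. */
final class MultiFieldExpander {

    /** Turns "john doe" + [name, address] into "(name:john address:john) (name:doe address:doe)". */
    static String expand(String userQuery, String... fields) {
        StringBuilder out = new StringBuilder();
        for (String term : userQuery.trim().split("\\s+")) {
            if (term.isEmpty()) {
                continue; // an empty query yields an empty string
            }
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append('(');
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) {
                    out.append(' ');
                }
                // Assumption: the term contains no characters that need escaping.
                out.append(fields[i]).append(':').append(term);
            }
            out.append(')');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(expand("john doe", "name", "address", "city"));
    }
}
```

The user still types a single query without naming a field, while the index keeps the fields separate for any specialized searching later on.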