Querying against a comma separated list of IDs with Examine and Lucene.Net?

Question

I am using Examine for Umbraco (which is built on top of Lucene.Net) for my search. I am fairly sure my problem is Lucene-related.

One of my fields contains a list of comma-separated IDs. How do I query this field correctly?

For example, I have a field with the value "64,65". I have tried MultipleCharacterWildcard, which returns a result when I query for ID 64 but not for ID 65. SingleCharacterWildcard does not return anything, and Fuzzy only returns a result if there is a single ID in the field. Any ideas on how to do a proper search? I guess what I am looking for is a "Contains" query.

Also, is this the right way to handle fields with comma-separated lists, or would it be better to split the list up into individual fields?

araqnid · Accepted Answer · 2011-02-28T17:54:25

I would certainly split the list up into separate fields. You can have multiple values for the same field name in a document, which is a fairly natural way to represent a set of values:

venue_id: 12345
treatment_id_set: 1234
treatment_id_set: 2345

With documents like this, I can simply query for "treatment_id_set:1234" to find all the venues supporting that treatment. Of course, the ordering of the treatments is lost. If you need to recover it, also store the original comma-separated value in a separate field that is not indexed, while indexing the individual members:

# stored, indexed
venue_id: 12345
# stored, not indexed
treatment_id_list: 1234,2345
# not stored, indexed
treatment_id_set: 1234
treatment_id_set: 2345
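
To make the layout above concrete, here is a minimal sketch in Python of a toy in-memory index. It is not the Lucene.Net or Examine API; the `ToyIndex` class, the `venue_fields` helper, the document IDs 1 and 2, and the second venue "67890" are all hypothetical names made up for illustration. It only models the idea from the answer: each ID becomes its own indexed value under the same field name, while the original comma-separated string is stored but not searchable.

```python
from collections import defaultdict


class ToyIndex:
    """Toy stand-in for a search index; NOT the Lucene.Net or Examine API."""

    def __init__(self):
        self.postings = defaultdict(set)  # (field, term) -> set of doc ids
        self.stored = {}                  # doc id -> {field: stored value}

    def add(self, doc_id, fields):
        # fields: iterable of (name, value, store, index) tuples.
        # The same name may appear several times (a multi-valued field).
        self.stored[doc_id] = {}
        for name, value, store, index in fields:
            if store:
                self.stored[doc_id][name] = value
            if index:
                self.postings[(name, value)].add(doc_id)

    def term_query(self, name, value):
        # Exact match on one whole term, like the query treatment_id_set:1234.
        return sorted(self.postings.get((name, value), set()))


def venue_fields(venue_id, treatment_ids_csv):
    """Build the field list for one venue from its comma-separated IDs."""
    fields = [
        # stored, indexed
        ("venue_id", venue_id, True, True),
        # stored, not indexed: keeps the original ordering for display
        ("treatment_id_list", treatment_ids_csv, True, False),
    ]
    for treatment_id in treatment_ids_csv.split(","):
        treatment_id = treatment_id.strip()
        if treatment_id:
            # not stored, indexed: one value per ID, same field name
            fields.append(("treatment_id_set", treatment_id, False, True))
    return fields


index = ToyIndex()
index.add(1, venue_fields("12345", "1234,2345"))
index.add(2, venue_fields("67890", "2345"))  # hypothetical second venue

hits_1234 = index.term_query("treatment_id_set", "1234")
hits_2345 = index.term_query("treatment_id_set", "2345")
original_order = index.stored[1]["treatment_id_list"]
```

In this sketch every ID is matched as a whole term regardless of its position in the original list, so no wildcard or fuzzy query is needed, and the stored `treatment_id_list` value can be read back from a hit when the original order matters.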
