3
votes

I am using Examine for Umbraco (which is built on top of Lucene.net) to do my search. I am quite sure my problem is Lucene related.

One of my fields contains a list of comma-separated IDs. How do I query this field correctly?

For example, I have a field with the value "64,65". I have tried MultipleCharacterWildcard, which only returns a result if I query for ID 64, but not for ID 65. SingleCharacterWildcard does not return anything, and Fuzzy only returns something if there is only one ID in the field. Any ideas on how to do a proper search? I guess what I am looking for is a "Contains" query.

Also, is this the right way to handle fields with comma-separated lists, or would it be better to split the list up into individual fields?


2 Answers

4
votes

I would certainly split the list up into separate fields. You can have multiple values for the same field name in a document, which is a fairly natural way to represent a set of values:

venue_id: 12345
treatment_id_set: 1234
treatment_id_set: 2345

With documents like this, I can simply query for "treatment_id_set:1234" to find all the venues supporting that treatment. Of course, the ordering of the treatments is lost. If you need to recover it, store the comma-separated value while indexing the individual members:

# stored, indexed
venue_id: 12345
# stored, not indexed
treatment_id_list: 1234,2345
# not stored, indexed
treatment_id_set: 1234
treatment_id_set: 2345
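
The layout above can be sketched in Lucene.Net roughly as follows. This is a minimal sketch, assuming Lucene.Net 3.0.3 (older 2.9.x builds expose some members under different names) and an in-memory index. The class and method names are hypothetical; the field names are the ones from the listing.

```csharp
using System;
using Lucene.Net.Analysis;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;

// Hypothetical demo class; assumes Lucene.Net 3.0.3.
public static class MultiValuedFieldDemo
{
    // Indexes one venue document and returns how many documents
    // contain the given treatment ID in treatment_id_set.
    public static int CountVenuesForTreatment(string treatmentId)
    {
        var directory = new RAMDirectory();

        using (var writer = new IndexWriter(directory, new KeywordAnalyzer(),
                   true, IndexWriter.MaxFieldLength.UNLIMITED))
        {
            var doc = new Document();

            // stored, indexed
            doc.Add(new Field("venue_id", "12345",
                Field.Store.YES, Field.Index.NOT_ANALYZED));

            // stored, not indexed: keeps the original ordering
            doc.Add(new Field("treatment_id_list", "1234,2345",
                Field.Store.YES, Field.Index.NO));

            // not stored, indexed: one field instance per ID
            foreach (var id in "1234,2345".Split(','))
            {
                doc.Add(new Field("treatment_id_set", id,
                    Field.Store.NO, Field.Index.NOT_ANALYZED));
            }

            writer.AddDocument(doc);
        }

        using (var searcher = new IndexSearcher(directory, true))
        {
            // Exact match against any one of the field's values.
            var query = new TermQuery(new Term("treatment_id_set", treatmentId));
            return searcher.Search(query, 10).TotalHits;
        }
    }
}
```

Under these assumptions, querying for either "1234" or "2345" should match the venue, because each ID is its own term and no wildcard is needed.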

0
votes

To add multiple fields with the same name to a Lucene document using Umbraco Examine, you need to hook into the indexer's DocumentWriting event.

_index.DocumentWriting += _index_DocumentWriting;

This will then expose the underlying Lucene document.

Fields can then be added like this:

foreach (var item in someList)
{
    e.Document.Add(new Field("fieldName", item, Field.Store.YES, Field.Index.NOT_ANALYZED));
}
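
If the source value is still a comma-separated string at that point, it has to be split before the loop. Here is a minimal sketch of such a helper, assuming Lucene.Net 3.0.3. The class name `IdSetFields` and method name `AddIdSet` are hypothetical and not part of Examine or Lucene.Net.

```csharp
using System;
using Lucene.Net.Documents;

// Hypothetical helper; not part of Examine or Lucene.Net.
public static class IdSetFields
{
    // Splits a comma-separated ID string and adds each ID to the
    // document as a separate, un-analyzed field with the same name.
    public static void AddIdSet(Document document, string fieldName, string commaSeparatedIds)
    {
        if (string.IsNullOrEmpty(commaSeparatedIds))
        {
            return;
        }

        var ids = commaSeparatedIds.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var id in ids)
        {
            // Trim so that "64, 65" and "64,65" produce the same terms.
            document.Add(new Field(fieldName, id.Trim(),
                Field.Store.NO, Field.Index.NOT_ANALYZED));
        }
    }
}
```

Inside the event handler this could be called as `IdSetFields.AddIdSet(e.Document, "fieldName", rawValue)`, where `rawValue` stands for the comma-separated string. How that string is obtained depends on your index setup and is not shown here.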