3
votes

In my Lucene.Net index, I have documents with a startDate field and an endDate field. Both fields store dates in yyyyMMdd format. How can I build a query that will return hits if today's date falls between those two dates?

startDateFieldValue < myTargetDate < endDateFieldValue

For example, if myTargetDate is 17760604, I'd want to get a document back that had a startDate field value of 10660101 and an endDate field value of 19990101.

The scenario is that I have a Lucene index whose documents represent particular building sites. Each site has a StartConstruction date and an EndConstruction date. My users will enter a specific date, and I want to find all properties that were under construction on that date.

Note: I'm working with Lucene.Net 1.9, a much older version, and my company can't upgrade (yet).

3
For ex: +mydatefield:[10660101 TO 19990101] +myotherfield:dthrasher - L.B
Um... I don't think that query makes sense. Let me edit my question to clarify what I mean. - dthrasher

3 Answers

6
votes

You can do this with a range query, specifically a NumericRangeQuery. Begin by indexing your dates using a NumericField and adding them to your document like:

var df = new NumericField(Fields.AmendedDate);
df.SetIntValue(int.Parse(itemToIndex.startDate.ToString("yyyyMMdd")));
doc.Add(df);

You can make your indexing a little faster by reusing your NumericField across many documents (see the documentation). With your dates indexed, you are ready to search across them. To do this, use a NumericRangeQuery:

var q = NumericRangeQuery.NewIntRange(  Fields.AmendedDate,
                                        int.Parse(SearchFrom.ToString("yyyyMMdd")),
                                        int.Parse(SearchTo.ToString("yyyyMMdd")),
                                        true, true);

This query can then be used on its own or combined with an existing query like:

masterQuery.Add(q, BooleanClause.Occur.MUST);

Splitting your search in this way is far faster than a textual term search because of how numeric fields are indexed. You can also change the resolution (day level in this instance) to suit your data: if you need hours, minutes or seconds, append them to the format string from most to least significant. Finally, because this is a normal query rather than a filter, your search skips the filtering step.
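As a rough illustration of the resolution point, here is a minimal sketch of turning a date into a sortable numeric key. The names DateKeys, ToDayKey and ToSecondKey are made up for this example and are not part of Lucene.Net. One thing worth noting: a yyyyMMdd key fits in an int, but a 14-digit yyyyMMddHHmmss key does not, so at that resolution you would presumably need the long-valued counterparts of the int calls shown above.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of Lucene.Net.
public static class DateKeys
{
    // Day resolution: 8 digits, e.g. 17760604. Fits comfortably in an int.
    public static int ToDayKey(DateTime d)
    {
        string s = d.ToString("yyyyMMdd", CultureInfo.InvariantCulture);
        return int.Parse(s, CultureInfo.InvariantCulture);
    }

    // Second resolution: 14 digits, e.g. 19990101130509. Too large for an int,
    // so the key is a long. Fields run from most to least significant so that
    // numeric order matches chronological order.
    public static long ToSecondKey(DateTime d)
    {
        string s = d.ToString("yyyyMMddHHmmss", CultureInfo.InvariantCulture);
        return long.Parse(s, CultureInfo.InvariantCulture);
    }
}
```

Because the most significant part comes first, comparing two keys numerically gives the same answer as comparing the underlying dates.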

1
votes

I'm not sure I phrased my question properly. I want to find out if a particular item was active between a start and an end date. The StartDate is stored in one Lucene field, the EndDate in another.

Here's the search snippet I used:

var searchableDate = DateTools.DateToString(dateToSearchFor, DateTools.Resolution.DAY);

var lowerRange = new RangeQuery(null, new Term("StartDate", searchableDate), true);
var upperRange = new RangeQuery(new Term("EndDate", searchableDate), null, true);

var activeTodayFilter = new BooleanQuery();
activeTodayFilter.Add(new BooleanClause(lowerRange, BooleanClause.Occur.MUST));
activeTodayFilter.Add(new BooleanClause(upperRange, BooleanClause.Occur.MUST));
return activeTodayFilter;
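To sanity-check the logic of the two open-ended ranges without an index, here is a small sketch in plain C#. It assumes both fields hold yyyyMMdd strings, which sort lexicographically in the same order as the dates they represent. ActiveOnDate and IsActive are hypothetical names for this illustration only; the method just mirrors the two MUST clauses with inclusive bounds.

```csharp
using System;

// Hypothetical model of the query logic, not Lucene.Net code.
public static class ActiveOnDate
{
    // True when startDate <= target <= endDate, comparing yyyyMMdd strings
    // ordinally (the same kind of ordering a term range relies on).
    public static bool IsActive(string startDate, string endDate, string target)
    {
        // Mirrors the range with no lower bound on StartDate.
        bool startedOnOrBefore = string.CompareOrdinal(startDate, target) <= 0;
        // Mirrors the range with no upper bound on EndDate.
        bool endsOnOrAfter = string.CompareOrdinal(endDate, target) >= 0;
        return startedOnOrBefore && endsOnOrAfter;
    }
}
```

With the values from the question, IsActive("10660101", "19990101", "17760604") is true, while a site that finished in 17000101 would not match.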

I found the solution in an old Lucene forum/newsgroup, but I'm afraid I don't remember the link.

If there's an easier/better way to write the query above, let me know.

0
votes

You have to use a RangeQuery.

RangeQuery rq = new RangeQuery(new Term("date", "10660101"), new Term("date", "19990101"), true);

In a more recent version (Lucene.Net 2.9 or later), you could use NumericField/NumericRangeQuery for better performance.