Increasing the weight of particular terms (e.g. headings) when indexing documents in Lucene

Question

I have documents which I am indexing with Lucene. These documents basically have a title (text) and body (text). Currently I am creating an index out of Lucene Documents with (amongst other fields) a single searchable field, which is basically title+" "+body. In this way, if you search for anything which occurs in the title or in the body, you will find the document.

However, now I have learned of the new requirement that matches in the title should cause the document to be "more relevant" than matches in the body. Thus, if there is a document with the title "Software design", and the user searches for "Software design", then that document should be placed higher up in the search results than a document called something else, which mentions software design a lot in the body.

I don't really have any idea how to begin implementing this requirement. I know that Google, for example, treats certain parts of a document as "more relevant" (e.g. text within <h1> tags), and everyone here assumes Lucene supports something similar.

However,

The Javadoc for the Document class clearly states that fields contain text, i.e. not structured text where some parts are "more important" than other parts.
This blog post states "With Lucene, it is impossible to increase or decrease the weight of individual terms in a document."

I'm not really sure where to look. What would you suggest?

Any specific information (e.g. links to Lucene documentation) stating flatly that such a thing is not possible would also be helpful, since then I needn't spend any further time looking for how to do it. (The software is already written with Lucene and we won't rewrite it now, so if Lucene doesn't support this, there's nothing anyone (my boss) can do about that.)

Persimmonium · Accepted Answer · 2011-02-16T14:31:28

Just use two fields, title and body, and boost the 'title' field while indexing:

title.setBoost(float)

see here
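
To illustrate why splitting the text into two fields and boosting one of them changes the ranking, here is a toy model in plain Java. It does not use Lucene and does not reproduce Lucene's actual scoring formula; the class name FieldBoostSketch, the helper methods, and the boost value of 3.0 are all assumptions made up for this sketch. It only shows the idea: hits in the title count for more than hits in the body.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class FieldBoostSketch {

    // Hypothetical boost value, playing the role of title.setBoost(3.0f).
    static final float TITLE_BOOST = 3.0f;

    // Counts tokens in the text that equal one of the query terms.
    // Tokenizing is a naive whitespace split, far simpler than a real analyzer.
    static int termHits(String text, List<String> queryTerms) {
        int hits = 0;
        for (String token : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (queryTerms.contains(token)) {
                hits++;
            }
        }
        return hits;
    }

    // Toy score: title hits are multiplied by the boost, body hits count once.
    static float score(String title, String body, String query, float titleBoost) {
        List<String> terms = Arrays.asList(query.toLowerCase(Locale.ROOT).split("\\s+"));
        return titleBoost * termHits(title, terms) + termHits(body, terms);
    }

    public static void main(String[] args) {
        String query = "software design";
        float titleMatch = score("Software design",
                "An introduction to the topic.", query, TITLE_BOOST);
        float bodyMatch = score("Team handbook",
                "Software design notes and more software design notes.", query, TITLE_BOOST);
        System.out.println("title match: " + titleMatch);
        System.out.println("body-only match: " + bodyMatch);
    }
}
```

In this toy model, the document titled "Software design" scores 6.0 and the body-only document scores 4.0, so the title match ranks first. With the boost set to 1.0 the order flips (2.0 against 4.0), which is the behaviour the single combined field gives you.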
