I have run into a problem with Elasticsearch highlighting. I am using the elasticsearch-web plugin from Rivetlogic to integrate Elasticsearch into Liferay Portal. It works fine, but when I use the highlighter on some documents, the wrong words are highlighted. This problem doesn't seem to be connected with the Rivetlogic plugin itself: I was able to reproduce it through the Sense add-on with a plain Elasticsearch query as well.
An example query:
POST /liferay_company_20155/com_liferay_portlet_documentlibrary_model_DLFileEntry/_search
{
  "query": {
    "query_string": {
      "query": "+(+(companyId:20155) +((+(entryClassName:com.liferay.portal.model.User) +(status:0)) (+(entryClassName:com.liferay.portlet.bookmarks.model.BookmarksEntry) +(status:0)) (+(entryClassName:com.liferay.portlet.bookmarks.model.BookmarksFolder) +(status:0)) (+(entryClassName:com.liferay.portlet.blogs.model.BlogsEntry) +(status:0)) (+(entryClassName:com.liferay.portlet.documentlibrary.model.DLFileEntry) +(status:0) +(hidden:false)) (+(entryClassName:com.liferay.portlet.documentlibrary.model.DLFolder) +(status:0) +(hidden:false)) (+(entryClassName:com.liferay.portlet.journal.model.JournalArticle) +(status:0) +(head:true)) (+(entryClassName:com.liferay.portlet.journal.model.JournalFolder) +(status:0)) (+(entryClassName:com.liferay.portlet.messageboards.model.MBMessage) +(status:0) +(discussion:false)) (+(entryClassName:com.liferay.portlet.wiki.model.WikiPage) +(status:0)))) +(assetCategoryTitles:*zkouska* assetCategoryTitles_cs_CZ:*zkouska* assetTagNames:*zkouska* comments:zkouska content:zkouska description:zkouska properties:zkouska title:zkouska url:zkouska userName:*zkouska* -stagingGroup:true city:zkouska country:zkouska emailAddress:*zkouska* firstName:zkouska fullName:zkouska lastName:zkouska middleName:zkouska region:zkouska screenName:zkouska street:zkouska zip:zkouska ddmContent:zkouska extension:zkouska fileEntryTypeId:zkouska path:*zkouska* classPK:zkouska content_cs_CZ:zkouska description_cs_CZ:zkouska entryClassPK:zkouska title_cs_CZ:zkouska type:zkouska articleId:zkouska)"
    }
  },
  "highlight": {
    "pre_tags": ["<tag1>"],
    "post_tags": ["</tag1>"],
    "fields": {
      "content": {}
    }
  }
}
The resulting highlight looks like this:
"highlight": {
  "content": [
    " logické\n1 nebo <tag1>0</tag1> (<tag1>true</tag1> nebo <tag1>false</tag1>).\n\nfunction ALTERNATIV(P:real): Boolean;\nvar X: real;\nbegin\n\nX",
    " pouze na změnu <tag1>FALSE</tag1> na <tag1>TRUE</tag1>, případně na\npřekročení mezní hodnoty směrem nahoru). Protože C",
    " (metoda Test), a to buď z hodnoty nula (<tag1>FALSE</tag1>) na\nhodnotu různou od nuly (<tag1>TRUE</tag1>), nebo obráceně",
    "\n\ndetekci změny pouze z hodnoty <tag1>FALSE</tag1> na hodnotu <tag1>TRUE</tag1>, DetectDOWN detekuje opačnou\nzměnu. DetectALL",
    " článek. Nechť X(ui) = x, Y (ui) = y, T =\n〈<tag1>0</tag1>,∞), I = O = R, I je vstupní abeceda, O je výstupní abeceda"
  ]
}
Notice that the values "FALSE", "TRUE" and "0" are all highlighted. The queried word "zkouska" is also highlighted.
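To make the pattern easier to see, here is a small Python sketch that pulls the tagged terms out of a few of the fragments above and counts them case-insensitively. The function name `highlighted_terms` and the shortened fragment list are only for illustration; they are not part of the plugin or of Elasticsearch.

```python
import re
from collections import Counter

# Shortened copies of three fragments from the "content" highlight above.
fragments = [
    " logické\n1 nebo <tag1>0</tag1> (<tag1>true</tag1> nebo <tag1>false</tag1>).",
    " pouze na změnu <tag1>FALSE</tag1> na <tag1>TRUE</tag1>, případně na",
    "〈<tag1>0</tag1>,∞), I = O = R",
]


def highlighted_terms(fragments, pre="<tag1>", post="</tag1>"):
    """Count the lower-cased terms wrapped in the highlight tags (hypothetical helper)."""
    pattern = re.compile(re.escape(pre) + r"(.*?)" + re.escape(post), re.DOTALL)
    counts = Counter()
    for fragment in fragments:
        counts.update(term.lower() for term in pattern.findall(fragment))
    return counts


counts = highlighted_terms(fragments)
print(sorted(counts.items()))
```

Each of the terms counted here also appears as a literal value in the query (status:0, hidden:false, discussion:false, head:true), although those clauses target fields other than content. I cannot say for certain whether that is the cause.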
Is there any problem with the query?
Any help is appreciated.