I have run into a problem with Elasticsearch highlighting. I am using the elasticsearch-web plugin from Rivetlogic to integrate Elasticsearch into Liferay Portal. It works fine, but when I use the highlighter on some documents, the wrong words are highlighted. The problem doesn't seem to be connected with the Rivetlogic plugin itself: I was able to reproduce it through the Sense add-on with a plain Elasticsearch query as well.

An example query:

POST /liferay_company_20155/com_liferay_portlet_documentlibrary_model_DLFileEntry/_search
{
   "query": {
      "query_string": {
         "query": "+(+(companyId:20155) +((+(entryClassName:com.liferay.portal.model.User) +(status:0)) (+(entryClassName:com.liferay.portlet.bookmarks.model.BookmarksEntry) +(status:0)) (+(entryClassName:com.liferay.portlet.bookmarks.model.BookmarksFolder) +(status:0)) (+(entryClassName:com.liferay.portlet.blogs.model.BlogsEntry) +(status:0)) (+(entryClassName:com.liferay.portlet.documentlibrary.model.DLFileEntry) +(status:0) +(hidden:false)) (+(entryClassName:com.liferay.portlet.documentlibrary.model.DLFolder) +(status:0) +(hidden:false)) (+(entryClassName:com.liferay.portlet.journal.model.JournalArticle) +(status:0) +(head:true)) (+(entryClassName:com.liferay.portlet.journal.model.JournalFolder) +(status:0)) (+(entryClassName:com.liferay.portlet.messageboards.model.MBMessage) +(status:0) +(discussion:false)) (+(entryClassName:com.liferay.portlet.wiki.model.WikiPage) +(status:0)))) +(assetCategoryTitles:*zkouska* assetCategoryTitles_cs_CZ:*zkouska* assetTagNames:*zkouska* comments:zkouska content:zkouska description:zkouska properties:zkouska title:zkouska url:zkouska userName:*zkouska* -stagingGroup:true city:zkouska country:zkouska emailAddress:*zkouska* firstName:zkouska fullName:zkouska lastName:zkouska middleName:zkouska region:zkouska screenName:zkouska street:zkouska zip:zkouska ddmContent:zkouska extension:zkouska fileEntryTypeId:zkouska path:*zkouska* classPK:zkouska content_cs_CZ:zkouska description_cs_CZ:zkouska entryClassPK:zkouska title_cs_CZ:zkouska type:zkouska articleId:zkouska)"
      }
   },
   "highlight": {
      "pre_tags": ["<tag1>"],
      "post_tags": ["</tag1>"],
      "fields": {
         "content": {}
      }
   }
}

The result highlight looks like this:

"highlight": 
{
"content": [
  " logické\n1 nebo <tag1>0</tag1> (<tag1>true</tag1> nebo <tag1>false</tag1>).\n\nfunction ALTERNATIV(P:real): Boolean;\nvar X: real;\nbegin\n\nX",
  " pouze na změnu <tag1>FALSE</tag1> na <tag1>TRUE</tag1>, případně na\npřekročení mezní hodnoty směrem nahoru). Protože C",
  " (metoda Test), a to buď z hodnoty nula (<tag1>FALSE</tag1>) na\nhodnotu různou od nuly (<tag1>TRUE</tag1>), nebo obráceně",
  "\n\ndetekci změny pouze z hodnoty <tag1>FALSE</tag1> na hodnotu <tag1>TRUE</tag1>, DetectDOWN detekuje opačnou\nzměnu. DetectALL",
  " článek. Nechť X(ui) = x, Y (ui) = y, T =\n〈<tag1>0</tag1>,∞), I = O = R, I je vstupní abeceda, O je výstupní abeceda"
]
}

Notice that the values "FALSE", "TRUE" and "0" are all highlighted, in addition to the queried word "zkouska".

Is there any problem with the query?

Any help is appreciated.

1 Answer

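Setting require_field_match to true in the highlight section should fix this. When it is false (which appears to be the default in your version), the highlighter marks every term from the query that occurs in the content field, regardless of which field that term was searched against. That is why the values from clauses such as status:0, hidden:false and head:true end up highlighted alongside zkouska.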

Example:

POST /liferay_company_20155/com_liferay_portlet_documentlibrary_model_DLFileEntry/_search
{
   "query": {
      "query_string": {
         "query": "+(+(companyId:20155) +((+(entryClassName:com.liferay.portal.model.User) +(status:0)) (+(entryClassName:com.liferay.portlet.bookmarks.model.BookmarksEntry) +(status:0)) (+(entryClassName:com.liferay.portlet.bookmarks.model.BookmarksFolder) +(status:0)) (+(entryClassName:com.liferay.portlet.blogs.model.BlogsEntry) +(status:0)) (+(entryClassName:com.liferay.portlet.documentlibrary.model.DLFileEntry) +(status:0) +(hidden:false)) (+(entryClassName:com.liferay.portlet.documentlibrary.model.DLFolder) +(status:0) +(hidden:false)) (+(entryClassName:com.liferay.portlet.journal.model.JournalArticle) +(status:0) +(head:true)) (+(entryClassName:com.liferay.portlet.journal.model.JournalFolder) +(status:0)) (+(entryClassName:com.liferay.portlet.messageboards.model.MBMessage) +(status:0) +(discussion:false)) (+(entryClassName:com.liferay.portlet.wiki.model.WikiPage) +(status:0)))) +(assetCategoryTitles:*zkouska* assetCategoryTitles_cs_CZ:*zkouska* assetTagNames:*zkouska* comments:zkouska content:zkouska description:zkouska properties:zkouska title:zkouska url:zkouska userName:*zkouska* -stagingGroup:true city:zkouska country:zkouska emailAddress:*zkouska* firstName:zkouska fullName:zkouska lastName:zkouska middleName:zkouska region:zkouska screenName:zkouska street:zkouska zip:zkouska ddmContent:zkouska extension:zkouska fileEntryTypeId:zkouska path:*zkouska* classPK:zkouska content_cs_CZ:zkouska description_cs_CZ:zkouska entryClassPK:zkouska title_cs_CZ:zkouska type:zkouska articleId:zkouska)"
      }
   },
   "highlight": {
      "require_field_match": true,
      "pre_tags": ["<tag1>"],
      "post_tags": ["</tag1>"],
      "fields": {
         "content": {}
      }
   }
}
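
To make the effect of the flag easier to see, here is a toy model in Python. It is only a sketch and not Elasticsearch's actual highlighter: the QUERY_TERMS list, the highlight function and the snippet string are all made up for illustration, and the real query parsing and analysis are far more involved. It only shows the difference between "highlight every query term" and "highlight only terms that were searched in this field".

```python
import re

# Hypothetical stand-in for the clauses of the query_string query:
# (field, term) pairs. Elasticsearch does not represent queries this way.
QUERY_TERMS = [
    ("status", "0"),
    ("hidden", "false"),
    ("head", "true"),
    ("content", "zkouska"),
]


def highlight(text, field, query_terms, require_field_match):
    """Wrap matching words of `text` in <tag1> tags (toy model).

    With require_field_match=False every query term counts, whatever
    field it was written against. With True, only terms that target
    `field` count.
    """
    terms = {
        term.lower()
        for term_field, term in query_terms
        if not require_field_match or term_field == field
    }

    def wrap(match):
        word = match.group(0)
        if word.lower() in terms:
            return "<tag1>" + word + "</tag1>"
        return word

    # Visit each word once and wrap it if it is one of the selected terms.
    return re.sub(r"\w+", wrap, text)


snippet = "1 nebo 0 (true nebo false), zkouska"
loose = highlight(snippet, "content", QUERY_TERMS, require_field_match=False)
strict = highlight(snippet, "content", QUERY_TERMS, require_field_match=True)
print(loose)
print(strict)
```

In this model the first call tags "0", "true" and "false" as well as "zkouska", which mirrors the output in the question, while the second call tags only "zkouska".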

On an unrelated note: judging from the response, the status and hidden fields appear to be mapped as strings. You probably want them to be boolean.