
I had a search system that was working fine with Solr 4.9 with out-of-the-box configuration and schema.

In Solr 6.2 the highlight entries are being returned in the result, but they contain document ids only - no highlight text.

At first I thought it was because "content" is not in the default (managed) schema, but adding it did not make any difference. In any event there are other fields that were in the default schema (author, subject, title,...) that could return highlight text but I'm not getting anything for these either.

Highlighting does seem to be enabled in the default configuration, and my query seems ok (it also does not return any HL text when run from the Admin interface, so that rules out my code):

q=mike&rows=10&start=0&sort=score desc, last_modified desc&wt=json&fl=content_type,fsb_doctype,author,id,last_modified,subject,title,score,url&hl=true&hl.fragsize=250&hl.fl=content,author,subject,title&hl.simple.pre=%3Cb%3E&hl.simple.post=%3C%2Fb%3E&facet=true&facet.field=fsb_doctype&facet.field=fsb_origin&facet.field=fsb_mission
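For anyone reproducing this outside the Admin interface, here is a minimal sketch of how the same parameters could be assembled and URL-encoded in Python. The function name `build_highlight_query` is made up for illustration; the parameter names and values are taken from the query above. It only builds the query string and does not contact a server.

```python
from urllib.parse import urlencode, parse_qs

def build_highlight_query(term, hl_fields):
    """Return a URL-encoded Solr query string with highlighting and faceting.

    A list of tuples is used instead of a dict so that facet.field can repeat.
    """
    params = [
        ("q", term),
        ("rows", 10),
        ("start", 0),
        ("sort", "score desc, last_modified desc"),
        ("wt", "json"),
        ("fl", "content_type,fsb_doctype,author,id,last_modified,subject,title,score,url"),
        ("hl", "true"),
        ("hl.fragsize", 250),
        ("hl.fl", ",".join(hl_fields)),   # comma-separated list of fields to highlight
        ("hl.simple.pre", "<b>"),         # encoded as %3Cb%3E
        ("hl.simple.post", "</b>"),       # encoded as %3C%2Fb%3E
        ("facet", "true"),
        ("facet.field", "fsb_doctype"),
        ("facet.field", "fsb_origin"),
        ("facet.field", "fsb_mission"),
    ]
    return urlencode(params)

query_string = build_highlight_query("mike", ["content", "author", "subject", "title"])
print(query_string)
```

Building the string this way makes it easy to change only `hl.fl` between test runs while keeping everything else identical.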

The only schema changes I have made are the addition of 3 custom fields used in faceting.

Here's a sample of the json returned on the Solr 4.9 system (just the highlighting section):

"highlighting":{"/event_20060718.cfm":{"content":[" \n \n \n \n \n \n \n Code 582 - Events \n \n July 18, 2006 - Flight Software Branch <b>CMMI</b> <b>Appraisal</b> \n\n \nThe Flight Software Branch was successful in its latest <b>CMMI</b> <b>appraisal</b> - a SCAMPI Class A <b>Appraisal</b> of the Supplier Agreement Management (SAM"]},"/SDODocs/":{"content":[" \n \n \n \n \n \n \n SDO Flight Software \n \n \t \t SDO Flight Software Baselined Documents \n(Restricted access - contains only assets used in the <b>CMMI</b> <b>appraisal</b>. For access to up-to-date SDO FSW documents, see M W.) \n \t \t SDO Flight"]},"/LRO/":{"content":[" \n \n \n \n \n \n \n LRO Flight Software \n \n \t \t LRO Flight Software Baselined Documents \n(Restricted access - contains only assets used in the <b>CMMI</b> <b>appraisal</b>. For access to up-to-date LRO FSW documents, see Mike B.) \n \t \t LRO Flight"]},"TDL_582 Web&id=501":{"content":[" \n \n \n \n \n \n \n Action item from 582 Web group - group id 501 - closed item \n \n Michael \n Mike Tilley \n Update Events - <b>CMMI</b> <b>appraisal</b> results are official today! \n 09/18/14 - updated/created & deployed:\n\n/default.cfm\n\n/events.cfm"]},"TDL_582 Web&id=203":{"content":[" \n \n \n \n \n \n \n Action item from 582 Web group - group id 203 - closed item \n \n Michael \n Mike Tilley \n <b>CMMI</b> <b>Appraisal</b> results are \"official\" today - post an event. \n 10/11/11 - updated/created & moved to production (& Linux)..\n\n/default.cfm"]},"TDL_BSR FPI&id=1":{"content":[" \n \n \n \n \n \n \n Action item from BSR FPI group - group id 1 - closed item \n \n Michael \n Mike Tilley \n Action: send a BSR template, customized for FPI, to Victor. \n 12/15/10 - done (distracted by <b>CMMI</b> <b>Appraisal</b>!) 
\n "]},"/event_20080516.cfm":{"content":[" \n \n \n \n \n \n \n Code 582 - Events \n \n May 16, 2008 - Center <b>CMMI</b> <b>Appraisal</b> \n\n\n A <b>CMMI</b> SCAMPI Class A <b>Appraisal</b> for GSFC was completed successfully on Friday, May 16th. GSFC is now compliant with Agency policy regarding <b>CMMI</b> Maturity Level 2 for"]},"/event_20140918.cfm":{"content":[" \n \n \n \n \n \n \n Code 582 - Events \n \n September 18, 2014 - Center <b>CMMI</b> <b>Appraisal</b> \n\n\n A <b>CMMI</b> SCAMPI Class A <b>Appraisal</b> for GSFC was completed successfully on Monday, September 15th. GSFC continues to be compliant with Agency policy regarding <b>CMMI</b>"]},"TDL_582 Web&id=353":{"content":[" & we made the mistake of trying to use the same repository for the team, and for <b>CMMI</b> <b>appraisal</b> evidence. As a result, the team never used it, and it contains sensitive <b>appraisal</b> data. Closing this AI. \n "]},"TDL_582 Web&id=67":{"content":[" \n \n \n \n \n \n \n Action item from 582 Web group - group id 67 - closed item \n \n Michael \n Mike Tilley \n Had to find the SEI certification page for the GSFC <b>CMMI</b> <b>appraisal</b> - it wasn't easy. Should probably post a link to this on the website"]}

And here's a sample of the 6.1 json results (slightly different document set on this server, but the first result is the same in both cases):

"highlighting":{"/event_20060718.cfm":{},"/event_20080516.cfm":{},"/event_20140918.cfm":{},"/event_20111011.cfm":{},"/LRO/":{},"/SDODocs/":{},"/event_20060201.cfm":{},"TDL_582 Web&id=353":{},"TDL_582 Web&id=67":{},"TDL_BSR JWST&id=94":{}

So all I'm getting is a list of document ids with no highlight text.
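To make the difference between the two responses easy to check programmatically, here is a small Python sketch that lists the document ids whose highlighting entry is empty. The helper name `empty_highlight_ids` is hypothetical, and the two sample dictionaries are trimmed-down versions of the responses shown above (the 4.9 snippet text is abbreviated).

```python
import json

def empty_highlight_ids(response):
    """Return ids from the 'highlighting' section that carry no snippets."""
    highlighting = response.get("highlighting", {})
    return [doc_id for doc_id, fields in highlighting.items() if not fields]

# Trimmed sample in the shape returned by the newer server: ids only.
sample_new = json.loads(
    '{"highlighting": {"/event_20060718.cfm": {}, "/LRO/": {}}}'
)

# Trimmed sample in the shape returned by Solr 4.9: snippet text present
# (the snippet is abbreviated here for readability).
sample_old = {
    "highlighting": {
        "/event_20060718.cfm": {"content": ["... <b>CMMI</b> <b>Appraisal</b> ..."]}
    }
}

print(empty_highlight_ids(sample_new))
print(empty_highlight_ids(sample_old))
```

Running the same check against both servers after each configuration change shows quickly whether any field has started returning snippets.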

Please add an example of response expected and returned. - AR1

Sample added - hope this helps. - Mike

I have the same problem but if I fill an * in hl.fl I can see something. - ziyuang

Also I noticed that if some text inside an array is matched, it cannot be highlighted in this way, yet I don't know how to highlight the text in this case. - ziyuang

1 Answer


Changing your code to something like this should work:

select?q=I+WANT+PIZZA&wt=php&indent=true&fl=id,name&group=true&group.field=content&hl=on&hl.fl=*&hl.encoder=html&hl.fragmenter=regex&hl.regex.slop=100.0&hl.fl=text_*&hl.bs.type=WHOLE&hl.defaultSummary=true&hl.offsetSource=POSTINGS
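As a quick way to try the `hl.fl=*` idea against the query from the question, here is a hedged Python sketch that swaps the field list for a wildcard and leaves every other parameter alone. The helper name `with_wildcard_highlight` is invented for illustration, and the sample query is a shortened form of the original. Whether the wildcard restores snippets on a given Solr 6 setup is not confirmed here; one commenter reports seeing "something" with it.

```python
from urllib.parse import parse_qsl, urlencode

def with_wildcard_highlight(query_string):
    """Replace every hl.fl parameter in a query string with a single hl.fl=*."""
    pairs = parse_qsl(query_string, keep_blank_values=True)
    # Drop all existing hl.fl entries so the parameter is not specified twice.
    pairs = [(key, value) for key, value in pairs if key != "hl.fl"]
    pairs.append(("hl.fl", "*"))
    return urlencode(pairs)

# Shortened version of the query from the question.
original = "q=mike&wt=json&hl=true&hl.fragsize=250&hl.fl=content,author,subject,title"
rewritten = with_wildcard_highlight(original)
print(rewritten)
```

If snippets come back with the wildcard but not with the explicit field names, that points at the named fields themselves (for example how they are defined in the schema) rather than at the highlighter being disabled.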