How to filter values returned on a multivalued field in Solr

Question

I have a document with a field called uuids. This field is multivalued (a list) and can hold up to 100k values per document.

I want to find documents that have uuids values starting with "5ff6115e", for instance. I can already do that successfully with q=uuids:5ff6115e*:

http://localhost:8983/solr/test1/select?q=uuids%3A5ff6115e*&rows=1&fl=uuids&wt=json&indent=true

However, the returned document contains all 100k values of this field.

What I want is to filter not only the documents whose uuids field has a value starting with this prefix, but also the field values returned, so that I receive only the matching values in the response.

How can I do that?

It's a different question, as I don't want to filter just which fields will come in the result, but the values — mvallebr

David Smiley · Accepted Answer · 2015-04-10T15:30:49

Use highlighting. @Jokin first mentioned it, and I feel this is the best answer that doesn't involve hacking on Solr. Try either the PostingsHighlighter or the FastVectorHighlighter, not the default/standard highlighter. Unfortunately, both of them internally execute the wildcard query against all the UUIDs in this field. The FVH has the opportunity to be smarter about that internally, but it's not implemented that way.
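As a rough illustration of the highlighting approach, here is a minimal Python sketch that builds the request and pulls the matched values out of the "highlighting" section of the JSON response. The host and core come from the question; the "id" unique key field, the helper names, and the snippet limit are assumptions. The hl.useFastVectorHighlighter flag is the older (Solr 4/5 era) way to select the FVH, which also needs term vectors with positions and offsets on the field, so adjust for your version. Whether each snippet maps to exactly one field value depends on the highlighter and fragment settings.

```python
import re
from urllib.parse import urlencode

# Host and core taken from the question's example URL.
SOLR_SELECT = "http://localhost:8983/solr/test1/select"


def build_highlight_url(prefix, max_values=100):
    """Build a select URL that highlights uuids values matching the prefix."""
    params = {
        "q": f"uuids:{prefix}*",
        "rows": 1,
        "fl": "id",            # assumption: the uniqueKey field is "id"
        "wt": "json",
        "hl": "true",
        "hl.fl": "uuids",      # highlight only the multivalued field
        "hl.snippets": max_values,  # upper bound on snippets returned per field
        # Assumption: older Solr versions select the FVH with this flag.
        "hl.useFastVectorHighlighter": "true",
    }
    return SOLR_SELECT + "?" + urlencode(params)


def matched_values(response, field="uuids"):
    """Return {doc_id: [snippets with markup removed]} from a parsed response."""
    tag = re.compile(r"<[^>]+>")  # strips <em>...</em> or any custom tags
    result = {}
    for doc_id, fields in response.get("highlighting", {}).items():
        result[doc_id] = [tag.sub("", s) for s in fields.get(field, [])]
    return result
```

The first function only builds the URL; fetch it with whatever HTTP client you prefer and pass the parsed JSON to matched_values.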

Note: if it's within scope to write a little Java to add to Solr, the ideal answer would be to add term vectors (just the terms data in the term vector, no offsets or positions) and then write a "DocTransformer" that grabs the document's term-vector terms, seeks to the prefix, and then iterates over the terms that have that prefix. That would be pretty darned fast.
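The "seek to the prefix, then iterate" step can be sketched in a few lines. This is plain Python over a sorted list, not Solr or Lucene code: a real DocTransformer would be written in Java and would walk the document's term vector instead. The function name and sample data are hypothetical. The point is only that, because terms are kept in sorted order, you can jump straight to the first candidate and stop at the first term that no longer matches.

```python
from bisect import bisect_left


def terms_with_prefix(sorted_terms, prefix):
    """Return all terms starting with prefix from an already sorted list."""
    # "Seek": binary search for the first term >= prefix.
    i = bisect_left(sorted_terms, prefix)
    matches = []
    # "Iterate": terms sharing the prefix are contiguous in sorted order,
    # so stop as soon as one does not match.
    while i < len(sorted_terms) and sorted_terms[i].startswith(prefix):
        matches.append(sorted_terms[i])
        i += 1
    return matches


# Hypothetical sample data standing in for one document's uuids terms.
terms = sorted([
    "5ff6115e-0002",
    "4aaa0000-0001",
    "5ff6115e-0001",
    "5ff6115f-0001",
    "6000aaaa-0001",
])
print(terms_with_prefix(terms, "5ff6115e"))
```

The cost is one binary search plus one step per matching value, rather than a scan over all 100k values in the document.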

