  1. Field Type:

    <fieldType name="StrCollectionField" class="solr.StrField" omitNorms="true" multiValued="true" docValues="true"/>

    <field name="po_line_status_code" type="StrCollectionField" indexed="true" stored="true" required="false" docValues="false"/>

    po_no is the primary key.

  2. Index value: po_line_status_code:[3700.100]

  3. Search Query: po_line_status_code:(1100.200 1100.500 1100.600 1100.400 1100.300 1100.750 1100.450)

Result: documents with po_line_status_code: [3700.100] are returned as well.

Does Solr internally tokenize solr.StrField values that contain dots, or is some regular expression matching going on here? It sounds like a bug to me.

We don't get this document when we change the query to either of the following:

  1. po_line_status_code:(1200.200 1200.500 1200.600 1200.400 1200.300 1200.750 1200.450)

  2. po_line_status_code:(1100.200 1100.500 1100.600 1100.400 1100.300 1100.750 1100.450) AND po_no:938792842

We are using DSE 4.7.4, which ships with Apache Solr 4.10.3.0.203.

Debug query output from one of the servers that is returning wrong documents: response={numFound=2,start=0,docs=[SolrDocument{po_no=4575419580094, po_line_status_code=[3700.4031]}, SolrDocument{po_no=1575479951283, po_line_status_code=[3700.100]}]},debug={rawquerystring=po_line_status_code:(3 1100.200 29 5 6 1100.300 63 199 1100.500 200 1100.600 198 1100.400 343 344 345 346 347 409 410 428 1100.750 1100.450) ,querystring=po_line_status_code:(3 1100.200 29 5 6 1100.300 63 199 1100.500 200 1100.600 198 1100.400 343 344 345 346 347 409 410 428 1100.750 1100.450)]

I also see the following in the response, which I believe has something to do with ranking:

No match on required clause (po_line_status_code:3 po_line_status_code:1100.200 po_line_status_code:29 po_line_status_code:5 po_line_status_code:6 po_line_status_code:1100.300 po_line_status_code:63 po_line_status_code:199 po_line_status_code:1100.500 po_line_status_code:200 po_line_status_code:1100.600 po_line_status_code:198 po_line_status_code:1100.400 po_line_status_code:343 po_line_status_code:344 po_line_status_code:345 po_line_status_code:346 po_line_status_code:347 po_line_status_code:409 po_line_status_code:410 po_line_status_code:428 po_line_status_code:1100.750 po_line_status_code:1100.450)\n 0.0 = (NON-MATCH) product of:\n 0.0 = (NON-MATCH) sum of:\n 0.0 = coord(0/23)\n 0.015334824

Also, could it be something to do with re-indexing? If I re-index my documents will it fix the issue?

The links to doc file containing solr schema and solr config can be found here

What version are you on? - Bereng
What does debugQuery show? StrField should only give exact matches. Have you tried issuing just a single query against the value? (I'm not sure your example would match even if tokenization were happening, as none of the numbers are identical.) - MatsLindh
Works fine on Solr 4.10.4; there is no tokenizing going on for that field and no "regular expression" matching either. It is hard to see what you would match anyway, even if there were tokenizing. - David George
@MatsLindh: I have now added my debug query output to the question itself. - Akshay
@DavidG: Are you implying that there is an issue with the version I mentioned above? - Akshay

1 Answer


I've had to put this in an answer as the comments won't allow formatting.

No, it's not a version problem, a tokenizer problem, or a bug in Solr.

solr.StrField is not tokenized at either index time or query time, so the match is coming from something else. Can you post solrconfig.xml and schema.xml?
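To make the exact-match point concrete, here is a minimal sketch in plain Python (not Solr code). It models an untokenized string field as a set of whole values and a query as a list of whole terms; the function name `str_field_matches` is made up for this illustration.

```python
def str_field_matches(indexed_values, query_terms):
    """Return the query terms that exactly equal one of the indexed values.

    This only models the idea of an untokenized field: each value is a single
    term, so "3700.100" is never split into "3700" and "100".
    """
    indexed = set(indexed_values)
    return [term for term in query_terms if term in indexed]


# Terms from the query in the question.
query_terms = "1100.200 1100.500 1100.600 1100.400 1100.300 1100.750 1100.450".split()

# The document from the question: no term equals "3700.100", so nothing matches.
print(str_field_matches(["3700.100"], query_terms))
```

Under this model the document with po_line_status_code [3700.100] cannot be returned by the po_line_status_code clause alone, which is why the answer looks elsewhere for the cause.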

If you are searching on po_line_status_code this is the debug you should see:

"querystring": " po_line_status_code:(1100.200 1100.500 1100.600 1100.400 1100.300 1100.750 1100.450)",
    "parsedquery": "(+(po_line_status_code:1100.200 po_line_status_code:1100.500 po_line_status_code:1100.600 po_line_status_code:1100.400 po_line_status_code:1100.300 po_line_status_code:1100.750 po_line_status_code:1100.450))/",

Whereas what you are seeing is

querystring=ship_node:610055 AND po_line_status_code:(3 1100.200 29 5 6 1100.300 63 199 1100.500 200 1100.600 198 1100.400 343 344 345 346 347 409 410 428 1100.750 1100.450) AND expected_ship_date:[2016-02-03T16:00:00.000Z TO 2016-06-09T13:59:59.059Z]

So your query string has been altered. I assume all your queries go through the Solr admin tool? That should leave DSE out of the loop.

I still wouldn't expect your query to match, but things are more complicated than you have presented them, since you have ship_node and expected_ship_date in your query too.

Also, the "No match on required clause" entry says that the document did not match anything in the po_line_status_code clause.
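The `coord(0/23)` figure in that explain output can be read the same way. As a rough sketch, assuming the classic Lucene similarity where the coord factor is the number of matching optional clauses divided by the total number of clauses, the arithmetic for the document in the question looks like this (plain Python, not Solr code):

```python
def coord(overlap, max_overlap):
    """Coord factor as assumed here: matched clauses / total clauses."""
    return overlap / max_overlap


# The 23 terms from the po_line_status_code clause in the debug output.
terms = (
    "3 1100.200 29 5 6 1100.300 63 199 1100.500 200 1100.600 198 1100.400 "
    "343 344 345 346 347 409 410 428 1100.750 1100.450"
).split()

# Indexed value of the unexpected document from the question.
doc_values = {"3700.100"}

# Count how many query terms exactly equal an indexed value.
overlap = sum(1 for term in terms if term in doc_values)

print(f"coord({overlap}/{len(terms)}) = {coord(overlap, len(terms))}")
# prints: coord(0/23) = 0.0
```

Zero of the 23 terms match, which agrees with the NON-MATCH lines in the explain output.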