Hi,

I've noticed some differences when querying Solr with Java and PHP. The query looks like this one here:

text:(www)+timestamp:[2012-04-16T00:00:00Z TO 2012-04-20T23:59:00Z]&q.op=AND&rows=0&sort=timestamp%20desc&facet=true&facet.field=terms_nouns_lemma&facet.limit=20&facet.method=enum

When I print the number of documents found in Java

response.getResults().getNumFound()

I get almost 80,000, while the same query in PHP

$response->response->numFound

returns around 7,000. The PHP result seems to be more accurate, as only a time frame needs to be considered (and given the nature of the documents stored). But when I go to the admin page and enter my query, I again get around 80,000 (it is actually the same value as with Java).

What am I missing here?

It seems to me that Java doesn't consider the time frame at all. It may be worth mentioning that I'm using Solr 3.5 (and the corresponding version of the Java library SolrJ).

Note: I think this question is closely related, but it doesn't answer mine, as it doesn't take restrictions into consideration (such as the time frame in the query above).

Additionally, in PHP, if I don't set the number of rows I want in my response, it actually returns the correct number of documents found. Is there an equivalent in Java with SolrJ? (By default, if rows isn't set, it is set to 10; setting it to -1 doesn't work either.)

Thanks for any hints

Update

As posted in the comments below, the difference in the query is that SolrJ replaces a blank/space with a "+". I tried escaping it, both hardcoded and using ClientUtils.escapeQueryChars(String), but neither worked as expected.
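
For reference, a space turning into "+" is ordinary URL form encoding rather than something specific to SolrJ, and the reverse also holds: a "+" typed into a URL is decoded as a space, while a literal "+" inside a string that is then URL-encoded goes out as "%2B". The following is a minimal sketch using only the JDK (Java 10 or later for the Charset overloads), not SolrJ itself; the class and method names are made up for illustration.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows how "+" and " " behave under URL form encoding.
public class PlusEncodingDemo {

    // Encode a raw parameter value the way an HTTP client library typically would.
    static String encode(String raw) {
        return URLEncoder.encode(raw, StandardCharsets.UTF_8);
    }

    // Decode a parameter value the way a server typically would.
    static String decode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A space in the raw value is sent as "+".
        System.out.println(encode("a b"));    // a+b
        // A literal "+" in the raw value is sent as "%2B", so it stays a "+" on the server.
        System.out.println(encode("a+b"));    // a%2Bb
        // A "+" typed directly into a URL is read by the server as a space.
        System.out.println(decode("a+b"));    // a b
        System.out.println(decode("a%2Bb"));  // a+b
    }
}
```

If that applies here, a query string copied from a URL (where "+" stands for a space) and pasted into Java code as a literal would reach Solr with real "+" characters, which is a different query from the one sent by the browser or PHP.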

What's really funny as well:

text:(www)&facet.range=timestamp&f.timestamp.facet.range.end=2012-04-16T21:59:59.000Z&f.timestamp.facet.range.gap=+1MINUTE&rows=0

returns the same number of documents as

text:(www)
Comment (Ansari): Check your Solr logs for the exact queries that are being called.

1 Answer

Have you validated that the query being executed against the Solr index is the same for both SolrJ and PHP? Since you say the SolrJ query is not limiting by the date range you specified, I would suspect that something is not being set up or passed correctly from SolrJ.
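
One way to do that comparison is to copy the two query strings out of the Solr logs and diff them parameter by parameter after decoding. Below is a rough sketch using only the JDK (Java 10 or later); the class name, the method names, and the two sample strings in main are hypothetical placeholders, not the actual logged requests.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: compares two raw query strings parameter by parameter.
public class QueryDiff {

    // Split "a=1&b=2" into decoded name/value pairs.
    // Simplification: if a name repeats, only the last value is kept.
    static Map<String, String> parse(String queryString) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : queryString.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(URLDecoder.decode(name, StandardCharsets.UTF_8),
                       URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return params;
    }

    // Names of parameters that are missing on one side or have different decoded values.
    static Set<String> differingParams(String first, String second) {
        Map<String, String> a = parse(first);
        Map<String, String> b = parse(second);
        Set<String> names = new TreeSet<>(a.keySet());
        names.addAll(b.keySet());
        Set<String> differing = new TreeSet<>();
        for (String name : names) {
            if (!Objects.equals(a.get(name), b.get(name))) {
                differing.add(name);
            }
        }
        return differing;
    }

    public static void main(String[] args) {
        // Placeholder inputs; replace with the two lines from your own Solr log.
        String fromPhp = "q=a+b&q.op=AND&rows=0";
        String fromJava = "q=a%2Bb&rows=0";
        System.out.println(differingParams(fromPhp, fromJava));
    }
}
```

Any parameter name it prints is one that the two clients send differently, or that only one of them sends at all.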

Also, with regard to returning all the rows: you can set rows within SolrJ to an absurdly large number. Based on your counts, around 100,000 should work in this case.