0
votes

I have two queries that run against the same set of fields in Solr, and each returns a list of IDs (taken from another Solr field). How do I calculate the intersection of the two queries, that is, only the unique IDs common to both searches? I know I can run the queries separately and compute this on the client side, but I want to know whether there is a way to do it with a single request sent to Solr. Here is what my setup looks like.

Solr Fields:

<fields>
    <field name="key" type="uuid" indexed="true" required="true"/>
    <field name="tagname" type="string" indexed="true"  required="false"/>
    <field name="tagvalue" type="string" indexed="true" required="false"/>
</fields>

Now what I want to do is:

(tagname:xyz AND tagvalue:123)&fl=key

This would return a list of keys.

(tagname:abc AND tagvalue:456)&fl=key

This would also return a list of keys.

Now get the intersection of (the unique keys from) the two lists above.

Can this whole process be done in one step by running some kind of Solr intersection query?

Or is there a different Solr schema design I should adopt? I am open to that.

Your unique ID field is not a uniqueKey / unique field in the index, right? There's one document for each tagname/tagvalue combination, with a repeating "key" value? (i.e. 123/foo/bar, 123/abc/value, 123/xyz/baz) – MatsLindh
That is correct, the unique ID field is not a uniqueKey / unique field. There is one document for each key/tagname/tagvalue combination, and that combination always uniquely defines a document. – user3799300

2 Answers

0
votes

Solr likes denormalized data. With your existing schema, you would need to run two queries and intersect the results on the client. However, a slightly different schema would enable what you're looking for:

<fields>
    <field name="key" type="uuid" indexed="true" required="true"/>
    <field name="tags" type="string" indexed="true" multiValued="true" required="false"/>
</fields>

One way you could use this schema is to index tags as <name>_<value>, with all the tags for a given key in the same document. It's more work to build the index, but at query time you could do q=tags:xyz_123 AND tags:abc_456&fl=key and get the results you want with a single query. Atomic updates can help you build or maintain the index, but they do require storing all fields.
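As a rough illustration of the indexing step described above, here is a minimal Python sketch that groups per-tag rows into one document per key and builds the matching query string. The function names `build_denormalized_docs` and `build_query` and the sample rows are hypothetical, and the sketch assumes tag names and values contain no underscores or characters that need escaping in a Solr query. It does not talk to Solr at all.

```python
from collections import defaultdict


def build_denormalized_docs(rows):
    """Group (key, tagname, tagvalue) rows into one document per key.

    Each tag is stored as "<name>_<value>" in the multi-valued `tags` field.
    """
    tags_by_key = defaultdict(list)
    for key, tagname, tagvalue in rows:
        tags_by_key[key].append(f"{tagname}_{tagvalue}")
    return [{"key": key, "tags": tags} for key, tags in tags_by_key.items()]


def build_query(*pairs):
    """Build a q value that requires every (tagname, tagvalue) pair.

    Assumption: no escaping is needed for the names and values passed in.
    """
    return " AND ".join(f"tags:{name}_{value}" for name, value in pairs)


# Hypothetical sample data: one row per key/tagname/tagvalue combination.
rows = [
    ("k1", "xyz", "123"),
    ("k1", "abc", "456"),
    ("k2", "xyz", "123"),
]

docs = build_denormalized_docs(rows)
query = build_query(("xyz", "123"), ("abc", "456"))
```

With this sample data only the document for `k1` carries both tags, so it is the only one the combined query should match.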

0
votes

A filter query will do what you want. They're specified as fq params on your query and are intersected with the main query result. For example:

q=(tagname:xyz AND tagvalue:123)&fq=(tagname:abc AND tagvalue:456)&fl=key

The following will produce the same result:

q=*:*&fq=(tagname:xyz AND tagvalue:123)&fq=(tagname:abc AND tagvalue:456)&fl=key

The second form may be slightly quicker to execute as well, because q=*:* is constant-scoring and filter queries aren't scored. Based on your queries I'm guessing that scoring isn't a big deal for you.

Edit: this answer is completely wrong! See comments.
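
To illustrate the likely problem, here is a small in-memory Python sketch (not Solr itself) of the schema from the question, where each document holds exactly one key/tagname/tagvalue combination. The sample documents and the helper names `matching_docs` and `keys_of` are made up for illustration. Filter queries intersect sets of documents, so with this layout the result is empty, whereas the question asks for the intersection of the key values.

```python
# Hypothetical index contents: one document per key/tagname/tagvalue combination.
docs = [
    {"key": "k1", "tagname": "xyz", "tagvalue": "123"},
    {"key": "k1", "tagname": "abc", "tagvalue": "456"},
    {"key": "k2", "tagname": "xyz", "tagvalue": "123"},
]


def matching_docs(tagname, tagvalue):
    """Return the positions of documents matching tagname AND tagvalue."""
    return {
        i
        for i, doc in enumerate(docs)
        if doc["tagname"] == tagname and doc["tagvalue"] == tagvalue
    }


def keys_of(doc_ids):
    """Return the distinct key values of the given documents."""
    return {docs[i]["key"] for i in doc_ids}


first = matching_docs("xyz", "123")
second = matching_docs("abc", "456")

# Intersecting document sets, which is what combining q and fq amounts to:
# no single document has both tag pairs, so nothing is left.
doc_intersection = first & second

# Intersecting key values, which is what the question asks for
# (and what a client-side intersection of the two result lists computes).
key_intersection = keys_of(first) & keys_of(second)
```

Here `doc_intersection` is empty while `key_intersection` contains `k1`, which is the same result as running the two queries separately and intersecting the returned keys on the client.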