2
votes

I've had a hard time explaining and finding what I need, so please put yourself in my shoes for a moment.

My requirement comes from a relational database background. I may be using Solr to do something it wasn't designed to do, or maybe it can do what I need; I still need to confirm that. Hopefully you can assist me.

After indexing numerous documents into Solr, I need to retrieve distinct documents based on a filter. Think of it as retrieving distinct rows while also applying a WHERE condition.

For example, in a relational database, I may have the following columns:

(Country)  (City)     (Whatever)
 Egypt      Cairo      Hospitals
 Egypt      Alex       Schools
 Egypt      Mansoura   Hospitals
 Egypt      Cairo      Schools

If I perform this query: SELECT DISTINCT Country, City FROM mytable

I should get the following rows

(Country)  (City)
 Egypt      Alex
 Egypt      Mansoura
 Egypt      Cairo

Now, after indexing the original table (SELECT * FROM mytable), how can I achieve the SAME output from Solr? How can I retrieve documents that are distinct based on some fields? I will also need to apply a not-null filter on a specific field.

I don't need statistics of any kind, I only need to get the documents.

I hope I was clear enough. Thank you for your time.


3 Answers

2
votes

This would be achievable with field collapsing by grouping on multiple fields, but unfortunately only one field is supported right now. There is an open issue for this; check it out.
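
For reference, a single-field collapse request might be built roughly like this. It is only a sketch: the host and path are placeholders, the field names `City` and `Whatever` come from the question's example table, and the `fq` range query `Whatever:[* TO *]` is the usual way to express a not-null filter in Solr.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; adjust the host and core name to your setup.
SOLR_SELECT = "http://localhost:8983/solr/select"

def build_group_url(group_field, not_null_field):
    """Build a select URL asking for one document per distinct value of group_field."""
    params = [
        ("q", "*:*"),
        ("fq", f"{not_null_field}:[* TO *]"),  # keep only docs where the field has a value
        ("group", "true"),
        ("group.field", group_field),          # groups by one field, not by a combination
        ("group.limit", "1"),                  # one representative document per group
        ("wt", "json"),
    ]
    return SOLR_SELECT + "?" + urlencode(params)

url = build_group_url("City", "Whatever")
print(url)
```

This yields one document per city, which matches the desired output only as long as a city name never appears under two different countries.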

0
votes

Did you try faceting? You could do something like this:

http://localhost:8983/solr/select/?q=*:*&facet=on&facet.field=city&facet.field=country

It will return every distinct city and country, each with its count. The wiki has more details if you want to learn more about it.

I hope this helps.
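
Keep in mind that two separate `facet.field` parameters give the distinct values of each field independently, not the distinct (country, city) combinations. As a rough sketch of reading such a response in Python: with the JSON response writer's default layout, each facet comes back as a flat list alternating value and count. The sample response below is hypothetical, worked out by hand from the question's table and assuming untokenized string fields named `city` and `country`.

```python
def facet_values(response, field):
    """Return the distinct values of one facet field, skipping zero counts."""
    flat = response["facet_counts"]["facet_fields"][field]
    values = flat[0::2]   # even positions hold the field values
    counts = flat[1::2]   # odd positions hold the matching counts
    return [value for value, count in zip(values, counts) if count > 0]

# Hypothetical response fragment, derived from the question's example rows.
sample = {
    "facet_counts": {
        "facet_fields": {
            "city": ["Cairo", 2, "Alex", 1, "Mansoura", 1],
            "country": ["Egypt", 4],
        }
    }
}

print(facet_values(sample, "city"))
print(facet_values(sample, "country"))
```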

0
votes

Another good solution, available from Solr 4, is based on Pivot (Decision Tree) Faceting.

Try with:

/solr/collection1/select?q=*:*&facet=true&facet.pivot=Country,City

This should return:

  "facet_counts" : {
        "facet_queries" : {},
        "facet_fields" : {},
        "facet_dates" : {},
        "facet_ranges" : {},
        "facet_pivot" : {
           "Country,City" : [ {
                 "field" : "Country",
                 "value" : "Egypt",
                 "count" : 4,
                 "pivot" : [ {
                       "field" : "City",
                       "value" : "Cairo",
                       "count" : 2
                 }, {
                       "field" : "City",
                       "value" : "Alex",
                       "count" : 1
                 }, {
                       "field" : "City",
                       "value" : "Mansoura",
                       "count" : 1
                 } ]
           } ]
        }
  }
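
To get from this nested structure back to the flat SELECT DISTINCT Country, City rows, the pivot can be walked on the client side. A minimal Python sketch, assuming the response has already been parsed into a dict (the `distinct_pairs` helper is hypothetical, and the sample mirrors the output shown above):

```python
def distinct_pairs(response, pivot_key="Country,City"):
    """Flatten a two-level pivot facet into (outer value, inner value) tuples."""
    pairs = []
    for outer in response["facet_counts"]["facet_pivot"][pivot_key]:
        # Each outer entry carries its sub-values under "pivot".
        for inner in outer.get("pivot", []):
            pairs.append((outer["value"], inner["value"]))
    return pairs

# Same data as the response above, trimmed to the pivot section.
sample = {
    "facet_counts": {
        "facet_pivot": {
            "Country,City": [
                {
                    "field": "Country", "value": "Egypt", "count": 4,
                    "pivot": [
                        {"field": "City", "value": "Cairo", "count": 2},
                        {"field": "City", "value": "Alex", "count": 1},
                        {"field": "City", "value": "Mansoura", "count": 1},
                    ],
                }
            ]
        }
    }
}

for country, city in distinct_pairs(sample):
    print(country, city)
```

Each tuple corresponds to one distinct row; the counts are simply ignored, since only the combinations are needed.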