I currently have three cores in Solr, and my requirement is to search across all of them. The cores share some field names, as shown below:

core1: schema: field A, field B, field C, field D, field E

core2: schema: field A, field M, field N, field D

core3: schema: field X, field Y, field B, field N

(Don't read anything into the exact field patterns above; they only illustrate that the three cores have some fields in common and some that differ.)

To search across all of these cores together, I have implemented the following two solutions:

  1. I kept all three cores as they are and created a fourth core whose schema is the union of the three schemas. In its Solr config I set the "shards" parameter to the addresses of the three cores. This new core holds no indexed data; when it is queried, I believe it forwards the query to the three cores, merges their results, and returns them. (See: How to Search Multiple SOLR Core?)

  2. I merged all three cores into one core. Its schema is the union of the three schemas with every field made optional (required=false), so everything lives in a single index, with an additional field that selects which type to search. https://wiki.apache.org/solr/MultipleIndexes
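As a rough sketch of how the two approaches differ at query time, the snippet below only builds the request URLs and sends nothing. The host, the core names (`all_cores`, `combined`), and the `doc_type` field are hypothetical placeholders. The setup above configures `shards` in the Solr config; it is passed as a request parameter here purely for illustration.

```python
from urllib.parse import urlencode

# Hypothetical host and core names; adjust to the actual deployment.
SOLR = "http://localhost:8983/solr"
SHARDS = [
    "localhost:8983/solr/core1",
    "localhost:8983/solr/core2",
    "localhost:8983/solr/core3",
]


def distributed_query_url(q, aggregator_core="all_cores"):
    """Approach 1: query the empty aggregator core and fan out via `shards`."""
    params = {"q": q, "shards": ",".join(SHARDS)}
    return f"{SOLR}/{aggregator_core}/select?{urlencode(params)}"


def single_core_query_url(q, doc_type=None, core="combined"):
    """Approach 2: query the merged core, optionally filtering on a type field."""
    params = [("q", q)]
    if doc_type is not None:
        # `doc_type` is an assumed name for the extra type-selector field.
        params.append(("fq", f"doc_type:{doc_type}"))
    return f"{SOLR}/{core}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(distributed_query_url("A:foo"))
    print(single_core_query_url("A:foo", doc_type="core1"))
```

In the first case the merge happens across three indexes at request time; in the second there is a single index and the type field narrows the search only when needed.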

Both solutions work, but I am not sure which one to adopt. I am looking for clear pros and cons (I know a few, but not precisely) to help me pick the better one.

I would also like to know whether score calculation or relevance differs between the two approaches at search time, assuming all settings are identical in both.

You should use some kind of wrapper around all cores to get the results and merge them. Solr doesn't provide this feature. - Vinod
Both approaches above already give me correctly merged results, but I am not sure which one to select. - souro

1 Answer

Which of your two approaches to choose depends on how large the combined index is expected to grow. If the combined index of all three cores can be handled by a single machine, the single-core approach is the better choice; otherwise, choose the multiple-cores approach.

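With respect to scoring, Solr does not calculate universal term/doc frequencies across cores, so with the multiple-cores approach each core scores documents using its own statistics, and the scores can differ from those of a single combined index. You can refer to this Confluence link for further details.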
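To illustrate why per-core statistics matter, here is a small sketch with made-up numbers. It uses a BM25-style idf formula chosen only for illustration; the similarity Solr actually applies depends on the version and schema settings, and the counts below are invented.

```python
import math


def bm25_idf(doc_freq, num_docs):
    """BM25-style inverse document frequency (illustrative, not Solr's exact code)."""
    return math.log(1 + (num_docs - doc_freq + 0.5) / (doc_freq + 0.5))


# Made-up statistics for one term:
# (documents containing the term, total documents in the core)
core_stats = {"core1": (5, 1000), "core2": (400, 1000), "core3": (50, 100)}

# idf as each core would compute it from its own index only
local_idf = {name: bm25_idf(df, n) for name, (df, n) in core_stats.items()}

# idf as a single combined index would compute it
total_df = sum(df for df, _ in core_stats.values())
total_n = sum(n for _, n in core_stats.values())
global_idf = bm25_idf(total_df, total_n)

if __name__ == "__main__":
    for name, value in local_idf.items():
        print(f"{name}: local idf = {value:.3f}")
    print(f"combined index: idf = {global_idf:.3f}")
```

With these numbers the term is rare in core1 and common in core2, so the same term carries a different weight in each core than it would in one combined index, which is why merged rankings can differ between the two approaches.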