6
votes

We have an object with nested properties that we want to make easily searchable. That part was simple enough to achieve, but we also want to aggregate information based on multiple fields. In terms of the domain, we have multiple deals that share the same details with the exception of the seller. We need to consolidate these into a single result and show the seller options on the following page. However, we still need to be able to filter by seller on the initial page.

We attempted something like the index below to collect multiple sellers on one row, but the results contain duplicates and the index takes forever to build.

Map = deals => deals.Select(deal => new
{
    Id = deal.ProductId,
    deal.ContractLength,
    Provider = deal.Provider.Id,
    Amount = deal.Amount
});

Reduce = deals => deals.GroupBy(result => new
{
    result.ProductId,
    result.ContractLength,
    result.Amount
}).Select(result => new
{
    result.Key.ProductId,
    result.Key.ContractLength,
    Provider = result.Select(x => x.Provider).Distinct(),
    result.Key.Amount
});

I'm not sure this is the best way to handle the problem, but I'm fairly new to Raven and struggling for ideas. If we keep the index simple and group on the client side, then we can't keep paging consistent.

Any ideas?

1
Provider = result.Select(x => x.Provider).Distinct(): you can't do this. Map/Reduce is distributed, and at no point can you assume you have the entire collection of Providers. The only trustworthy LINQ operators in the reduce are ones like Count() and Sum(), because they are associative - Chris Marisic
I know this is distributed in platforms like Hadoop, but are you sure this is actually distributed in RavenDB? - Alex Jones
Yes, you can never depend on having the full object set in the reduce. That's not to say you won't have it; you may even have the full set 99% of the time, but if you don't have it even 1% of the time, you'll lead yourself into a minefield of misleading data. By running a select for .Provider in the reduce like that, you're ensuring your index will have data missing from it. - Chris Marisic

1 Answer

1
votes

You are grouping on the document id, deal.Id, so you'll never actually generate a reduction across multiple documents. I don't think that is intended.
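
To make the points about the grouping key and partial reduces concrete, here is a minimal sketch in plain LINQ to Objects. It is not RavenDB code. It assumes the fix is to group on a non-unique key (product, contract length, amount) and to have the map and the reduce emit the same shape, with the provider as a collection in both, so the reduce can safely run again over its own output. The question's snippet does neither: the map emits Id while the reduce groups on ProductId, and Provider is a single value in the map but a collection in the reduce. The Deal and Row records and the DealIndexSketch class are hypothetical names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical document shape, inferred from the fields used in the question.
public record Deal(string ProductId, int ContractLength, decimal Amount, string ProviderId);

// Map and Reduce both emit this one shape, so Reduce can consume its own output.
public record Row(string ProductId, int ContractLength, decimal Amount, string[] Providers);

public static class DealIndexSketch
{
    // Map: one row per deal, with the single provider wrapped in an array.
    public static IEnumerable<Row> Map(IEnumerable<Deal> deals) =>
        deals.Select(d => new Row(d.ProductId, d.ContractLength, d.Amount, new[] { d.ProviderId }));

    // Reduce: group on the non-unique key and merge the provider arrays.
    // The merged providers are sorted so the result does not depend on batch order.
    public static IEnumerable<Row> Reduce(IEnumerable<Row> rows) =>
        rows.GroupBy(r => new { r.ProductId, r.ContractLength, r.Amount })
            .Select(g => new Row(
                g.Key.ProductId,
                g.Key.ContractLength,
                g.Key.Amount,
                g.SelectMany(r => r.Providers)
                 .Distinct()
                 .OrderBy(p => p, StringComparer.Ordinal)
                 .ToArray()));
}

public static class Program
{
    public static void Main()
    {
        var deals = new[]
        {
            new Deal("products/1", 12, 20m, "providers/a"),
            new Deal("products/1", 12, 20m, "providers/b"),
            new Deal("products/1", 12, 20m, "providers/a"),
            new Deal("products/2", 24, 35m, "providers/c"),
        };

        // Simulate partial reduces: reduce two batches separately,
        // then reduce the partial results again.
        var partials = DealIndexSketch.Reduce(DealIndexSketch.Map(deals.Take(2)))
            .Concat(DealIndexSketch.Reduce(DealIndexSketch.Map(deals.Skip(2))));
        var batched = DealIndexSketch.Reduce(partials).ToList();

        foreach (var row in batched)
            Console.WriteLine($"{row.ProductId}: {string.Join(", ", row.Providers)}");
        // prints:
        // products/1: providers/a, providers/b
        // products/2: providers/c
    }
}
```

Because the reduce merges arrays with SelectMany rather than selecting a single Provider value, reducing in batches gives the same rows as reducing everything at once. In a RavenDB index the same two expressions would presumably go into the Map and Reduce properties, but I have not tested this against any particular RavenDB version, so treat it as a starting point rather than a drop-in index.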