NoSql RavenDb data design - maximum maps produced issue

Question

We are using RavenDb as our NoSql datastore. We will store a huge amount of data that needs high performance on the queries.

Our current relational model is translated to a document structure, i will use non relevant Class names because i cannot disclose the real business case:

Company

Has it's own properties and contains a list of stores (around 50)

Store

Has it's own properties and contains a list of Product (around 150)

Product

Has it's own properties

We have stored this as Json:

{
  "Id": 1,
  "CompanyName": "first company",
  "Location": "Not so far",
  "Stores": [
      {
        "Id": 125,
        "CompanyId": 1,
        "StoreName": "sweets",
        "Products": [
            {
              "Id": 1289,
              "StoreId": 125,
              "Name": "Candy",
              "Price" : 10
            }
        ]
      }
    ]
}

We want to expose each sublist of the document with an OData service.

We added an index map to the selected level but then we got a RavenDb error that we are only allowed 15 mapped outputs per document.

We could alternatively store the lowest level (Product) and duplicate all Store and Company data. But this seems a bad design.

Jens Pettersson Jens Pettersson · Accepted Answer · 2015-10-01T06:19:16

You're creating what's called a Fanout Index and RavenDB has limit the number of fanouts to 15 by default. This is partly due to the fact that fanout indexes are really hard to optimize for performance by the db.

In your case, it looks like you're going to store one document per "Company" that holds all that company stores and each store holds all the products. If that's really the case, the document might be rather large and migh cause concurrency problems IF something in the document is modified by more than one person at the same time.

Consider storing the Company as one document that holds an Id reference to each of its stores

{
    "Id": "Companies/1",
    "CompanyName": "first company",
    "Location": "Not so far",
    "Stores": [
            "Stores/1",
            "Stores/2"
        ]
}

and each Store (also separate document) holds a reference to each product:

{
    "Id": "Stores/1",
    "CompanyId": "Companies/1",
    "StoreName": "sweets",
    "Products": [
        "Products/1",
        "Products/2"
    ]
}

and the products (also separate document) holds a reference back to its store (if a product can only be in one store, as it looks like in your example):

{
  "Id": "Products/1",
  "StoreId": "Stores/1",
  "Name": "Candy",
  "Price" : 10
}

Now you can create any kind of index you want, like Products_ByStore, Stores_ByCompany etc. You can also do Products_ByStore_AndByCompany if you decide to store the CompanyId directly on the Product (or use a LoadDocument inside the Products index that loads the store and maps the stores Company id, but try to avoid that if possible).

If you want to return something in the same form as your json example you CAN do a LoadDocument on the CompanyId and use a ResultTransformer that loads all the stores and all their products on the server and returns everything. However, this might be a rather "expensive" operation, based on how many stores and products a company have, but at least it's server side (kind of like a JOIN in SQL).

Don't be afraid to use relations in a document db like Raven, but think about how the usage will be. You can combine the relation by id with more information about a store, like the name, if you're often loading a company and want to list all the stores by name. The document might look like this:

{
    "Id": "Companies/1",
    "CompanyName": "first company",
    "Location": "Not so far",
    "Stores": [
            { "Id": "Stores/1", "Name": "sweets" },
            { "Id": "Stores/2", "Name": "another store name" }
        ]
}

Now the store name is duplicated and will of course have to be updated on the Company document as well if changed in the Store document.

As always, it depends. Hope this have given some help at least!

//J

NoSql RavenDb data design - maximum maps produced issue

2 Answers