16
votes

I have recently started using Cosmos DB for a project and I am running into a few design issues. Coming from a SQL background, I understand that related data should be nested within documents on a NoSQL DB. This does mean that documents can become quite large though.

Since partial updates are not supported, what is the best design pattern to implement when you want to update a single property on a document?

Should I be reading the entire document server side, updating the value and writing the document back immediately in order to perform an update? This seems problematic if the documents are large, which they inevitably would be if all your data is nested.
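To make the read/update/write-back cycle concrete, here is a minimal sketch. It assumes an in-memory stand-in for a container (`InMemoryContainer`, `ConflictError` and `update_property` are hypothetical names, not Cosmos DB SDK calls) and uses a version tag, loosely modelled on the `_etag` property Cosmos DB puts on documents, so that a concurrent writer is not silently overwritten.

```python
import copy
import itertools


class ConflictError(Exception):
    """Raised when the stored document changed since it was read."""


class InMemoryContainer:
    """Hypothetical stand-in for a document container; not a real SDK class."""

    def __init__(self):
        self._docs = {}
        self._versions = itertools.count(1)

    def upsert(self, doc):
        # Store a copy and stamp it with a fresh version tag.
        stored = copy.deepcopy(doc)
        stored["_etag"] = str(next(self._versions))
        self._docs[stored["id"]] = stored
        return copy.deepcopy(stored)

    def read(self, doc_id):
        return copy.deepcopy(self._docs[doc_id])

    def replace(self, doc, if_match):
        # Reject the write if the document was replaced in the meantime.
        if self._docs[doc["id"]]["_etag"] != if_match:
            raise ConflictError(doc["id"])
        return self.upsert(doc)


def update_property(container, doc_id, key, value, max_attempts=3):
    """Read the whole document, change one property, write it all back."""
    for _ in range(max_attempts):
        doc = container.read(doc_id)
        etag = doc["_etag"]
        doc[key] = value
        try:
            return container.replace(doc, if_match=etag)
        except ConflictError:
            continue  # someone else wrote first: re-read and retry
    raise ConflictError(f"gave up on {doc_id} after {max_attempts} attempts")


container = InMemoryContainer()
container.upsert(
    {"id": "order-1", "status": "new", "lines": [{"sku": "A", "qty": 2}]}
)
updated = update_property(container, "order-1", "status", "shipped")
print(updated["status"])
```

Note that the whole document travels in both directions even though only `status` changes, which is exactly the cost being asked about.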

If I take the approach of making many smaller documents and inferring relationships based on IDs, I think this would solve the read-then-write-immediately concern for updates, but it feels like I am going against the concept of NoSQL and in essence building a relational DB.

Thanks

3
An excellent question. It looks as though the community are also asking it: feedback.azure.com/forums/263030-azure-cosmos-db/suggestions/… The implication here is that the 'small document / infer relationships' pattern is the way to go, for now. Would be lovely to see a white paper or similar on how small a 'small document' is. - Holf
Note that the limit in Cosmos DB for documents is 2 MB, so you are forced to use relatively small files. - influent

3 Answers

5
votes

Locking and latching. That's what needs to happen if partial updates become possible. It's a difficult engineering problem to keep a <15ms write latency SLA with locking.

"This seems problematic if the documents are large which they inevitably would be if all your data is nested."

Define your fear: burnt Request Units, app host memory, ingress/egress network traffic? You believe this is a problem, but you're not stating concrete results. I'm not saying you're wrong or doubting the efficiency of the partial update approach, I'm just saying the argument is thin.

Usually you want to JOIN nothing in NoSQL, so I'm totally with you on the last paragraph.

2
votes

Whenever you are designing a document, consider this:

  • Does this part of the document need to be accessed separately? If yes, create a referenced document; if no, create an embedded document.

    If you want more help choosing, take a look at this question. It is about MongoDB, but it will still help you: Embedded vs Referenced Document

0
votes

Embed or reference is the most common problem I face while designing document structure in the NoSQL world.

In an embedded relationship, child entities are embedded in the parent document. In a reference relationship, child entities live in separate documents from their parent, so you end up with two (or more) types of documents.
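As an illustration of the two shapes, here is a small sketch with made-up data. The field names (`type`, `profileId`, `addresses`) and the helpers are assumptions for the example, not a prescribed schema.

```python
# Embedded: the children live inside the parent document.
embedded_profile = {
    "id": "profile-1",
    "type": "profile",
    "name": "Ada",
    "addresses": [
        {"kind": "home", "city": "London"},
        {"kind": "work", "city": "Cambridge"},
    ],
}

# Referenced: parent and children are separate documents linked by id.
referenced_docs = [
    {"id": "profile-1", "type": "profile", "name": "Ada"},
    {"id": "address-1", "type": "address", "profileId": "profile-1",
     "kind": "home", "city": "London"},
    {"id": "address-2", "type": "address", "profileId": "profile-1",
     "kind": "work", "city": "Cambridge"},
]


def addresses_for(docs, profile_id):
    """Resolve the reference by hand: find the children of one parent."""
    return [
        d for d in docs
        if d["type"] == "address" and d["profileId"] == profile_id
    ]


def to_embedded(docs, profile_id):
    """Rebuild the embedded shape from the referenced documents."""
    parent = next(
        d for d in docs if d["type"] == "profile" and d["id"] == profile_id
    )
    children = [
        {"kind": a["kind"], "city": a["city"]}
        for a in addresses_for(docs, profile_id)
    ]
    return {**parent, "addresses": children}


print(to_embedded(referenced_docs, "profile-1") == embedded_profile)
```

Both shapes carry the same information; the difference is whether the application (or a second query) has to stitch parent and children back together.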

No single relationship pattern fits all cases. The approach you should take depends on the retrieval and update operations that will be performed on the data being modelled.

1. Do you need to retrieve all the child entities along with the parent entity? If yes, use embedded relationships.

2. Does your use case allow entities to be retrieved individually? In that case, use the reference (relationship) pattern.

In the majority of the use cases I have worked on, I used the relationship pattern. For example: Social Graph (profiles with a relationship tree), Proximity Points (GeoJSON-based proximity search), Classified Listings, etc.

The relationship pattern is also easier to update and maintain, as the entities are stored in individual documents.
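A rough sketch of why updates get cheaper: it compares how many bytes have to be written back when one child changes under each pattern. The data is synthetic, `payload_bytes` is a hypothetical helper, and serialized size is only a proxy here; it is not a measurement of Request Units.

```python
import json


def payload_bytes(doc):
    """Size of the JSON body that would have to be sent for a write."""
    return len(json.dumps(doc).encode("utf-8"))


# 500 synthetic comments, stored once as separate documents...
comments = [
    {"id": f"comment-{i}", "postId": "post-1", "text": "x" * 200}
    for i in range(500)
]
# ...and once embedded inside their parent post.
embedded_post = {
    "id": "post-1",
    "title": "Hello",
    "comments": [{"text": c["text"]} for c in comments],
}

# Embedded: editing one comment means rewriting the whole post.
embedded_post["comments"][42]["text"] = "edited"
embedded_write = payload_bytes(embedded_post)

# Referenced: only the one small comment document is rewritten.
comments[42]["text"] = "edited"
referenced_write = payload_bytes(comments[42])

print(embedded_write, referenced_write)
```

The trade-off runs the other way on reads: fetching the post with all of its comments is one document in the embedded case and many in the referenced case.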