Azure DocumentDB Data Modeling, Performance & Price

Question

I'm fairly new to NoSQL type databases, including Azure's DocumentDB. I've read through the documentation and understand the basics.

The documentation left me with some questions about data modeling, particularly in how it relates to pricing.

Microsoft charges fees on a "per collection" basis, with a collection being a list of JSON objects with no particular schema, if I understand it correctly.

Now, since there is no requirement for a uniform schema, is the expectation that your "collection" is analogous to a "database", in that the collection itself might contain different types of objects? Or is the expectation that each "collection" is analogous to a "table", in that it contains only objects of a similar type (allowing for variance in the object properties, perhaps)?

Does query performance dictate one way or another here?

Thanks for any insight!

Larry Maccherone · Accepted Answer · 2017-03-14T02:26:48

The normal pattern under DocumentDB is to store lots of different types of objects in the same "collection". You distinguish them either by having a field type = "MyType" or by setting a flag such as isMyType = true. The latter allows for subclassing and mixin behavior, because a single document can carry several such flags at once.
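To make the two tagging styles concrete, here is a minimal sketch in Python. It uses a plain list of dicts as a stand-in for a collection rather than a real DocumentDB client, and the document shapes, field values, and helper names (by_type, with_flag) are all made up for illustration. The query strings at the end show roughly what the same filters might look like in DocumentDB's SQL dialect; treat them as an assumption to check against the query documentation.

```python
# In-memory stand-in for ONE collection holding several kinds of documents.
# All field names and values here are hypothetical examples.
collection = [
    # Style 1: a single discriminator field.
    {"id": "1", "type": "Customer", "name": "Ada"},
    {"id": "2", "type": "Order", "customerId": "1", "total": 42.5},
    # Style 2: boolean flags, so one document can belong to several "types".
    {"id": "3", "isUser": True, "isAdmin": True, "name": "Grace"},
    {"id": "4", "isUser": True, "name": "Linus"},
]


def by_type(docs, type_name):
    """Return documents whose 'type' field equals type_name."""
    return [d for d in docs if d.get("type") == type_name]


def with_flag(docs, flag):
    """Return documents that carry the given boolean flag set to True."""
    return [d for d in docs if d.get(flag) is True]


orders = by_type(collection, "Order")
users = with_flag(collection, "isUser")    # includes admins: mixin-style
admins = with_flag(collection, "isAdmin")  # a subset of users

# Roughly equivalent server-side filters (assumed syntax, not verified here):
ORDERS_QUERY = 'SELECT * FROM c WHERE c.type = "Order"'
ADMINS_QUERY = "SELECT * FROM c WHERE c.isAdmin = true"
```

Note how document "3" shows up in both the users and admins results. That overlap is what the flag style buys you over a single type field, which can hold only one value per document.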

As for performance, DocumentDB gives you guaranteed 10ms read/15ms write latency for your chosen throughput. For your production system, put everything in one big "partitioned collection" and adjust the size and throughput levers over time as your space needs and load change. You'll get essentially infinite scalability, and DocumentDB will take care of allocating (and deallocating) resources (secondaries, partitions, etc.) as you increase (or decrease) your throughput and size levers.

