1
votes

I am building a multi-tenant SaaS app, where I have many customers (anywhere from hundreds to thousands). Each of these customers has customers of their own, who can create accounts and interact with our app. I am using Node for my server and MongoDB for the database.

Currently, my approach to storing my tenants' data is to place all of it in one database and use MongoDB sharding to partition it and scale out when I need to in the future. MongoDB sharding seems to me like a great way to partition and manage my data.

However, some people recommend that I have one database for each tenant. They say it is better, as it is easier to "migrate/manage/scale" and "more secure". I would like to get a second opinion on this.

Giving each tenant a database increases the complexity of my app, so I would like to know if it's really necessary.

I would be grateful for any insights. Thank you in advance for your answer!

1 Answer

4
votes

These are my opinions, and I would love to discuss them.

There are several well-known ways to store tenant-based data. I think the choice depends on your solution, your budget, the size of your team, and where you want to put the complexity.

  • Database system instance per tenant: We generally use this method for on-premise solutions. Tenants use and manage their own database instance on their own server/cloud.
  • Database per tenant: This is the most secure way to isolate tenants' data from each other in a cloud solution, but it needs extra effort for maintenance and management (backups, development changes, reindexing, etc.). Furthermore, the application must be able to handle and pool connections for each database.
  • Schema per tenant: Not possible in MongoDB, since it has no separate schema level inside a database.
  • Table/collection per tenant:
  • Row / document per tenant: Provides weak isolation, and the data model has to be optimized well for each kind of query. However, it is the easiest to maintain.
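
To make the difference between the database-per-tenant and document-per-tenant options more concrete, here is a minimal sketch in Node. All helper names (`tenantDbName`, `dbForTenant`, `scopedFilter`) are hypothetical, and `client` is assumed to be a connected `MongoClient` from the official `mongodb` driver, where `client.db(name)` returns a handle that reuses the client's existing connections.

```javascript
// Hypothetical helpers for illustration only; not part of any library.

// Database per tenant: derive a database name from the tenant id.
// Caveat: two different ids can map to the same name after sanitizing
// (e.g. "a b" and "a_b"), so real ids should already be unique and safe.
function tenantDbName(tenantId) {
  const safe = String(tenantId).toLowerCase().replace(/[^a-z0-9_-]/g, "_");
  return `tenant_${safe}`;
}

// `client` is assumed to expose `.db(name)`, as MongoClient does.
function dbForTenant(client, tenantId) {
  return client.db(tenantDbName(tenantId));
}

// Document per tenant: force the tenantId into every query filter.
// The tenantId is spread last, so a caller cannot override it.
function scopedFilter(tenantId, filter = {}) {
  if (!tenantId) throw new Error("tenantId is required");
  return { ...filter, tenantId };
}
```

With document-level isolation, every query would go through something like `collection.find(scopedFilter(tenantId, { status: "open" }))`. The weak isolation mentioned above comes from the fact that a single query that skips this step can read another tenant's documents.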

Considering a solution that is maintained by a very small team (2 or 3 people): I would use document-based isolation (with a tenantId field) and have several sharded clusters to scale the tenant data, using the initials of the tenants' names as the shard key.

  • cluster#01: Tenants whose names start with A-H
  • cluster#02: Tenants whose names start with I-S
  • ...
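
The routing by initials described above could be sketched as follows. The ranges are assumed to be non-overlapping, and `cluster#03` plus the fallback rule are my own assumptions to make the example complete; they are not part of the list above.

```javascript
// Hypothetical range table: each cluster owns a range of first letters.
const CLUSTER_RANGES = [
  { from: "a", to: "h", cluster: "cluster#01" },
  { from: "i", to: "s", cluster: "cluster#02" },
  { from: "t", to: "z", cluster: "cluster#03" }, // assumed third cluster
];

// Assumption: names that do not start with a-z go to the last cluster.
const FALLBACK_CLUSTER = "cluster#03";

function clusterForTenant(tenantName) {
  const initial = String(tenantName).trim().charAt(0).toLowerCase();
  const match = CLUSTER_RANGES.find(
    (range) => initial >= range.from && initial <= range.to
  );
  return match ? match.cluster : FALLBACK_CLUSTER;
}
```

One thing to keep in mind with this kind of scheme: tenant names are not evenly distributed over the alphabet, so the ranges may need adjusting once real data shows which clusters fill up faster.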

Considering a cloud solution that is used by thousands of customers worldwide and maintained by a large team: I would probably go with sharded clusters in different countries, assign each tenant to the cluster for its location, and have a database per tenant.
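
As a rough sketch of that placement, assuming each tenant record carries a `region` and an `id`: the region codes, the `.invalid` hostnames, and the `placementForTenant` helper are all made up for illustration.

```javascript
// Hypothetical mapping from a tenant's region to the cluster serving it.
const REGION_CLUSTERS = new Map([
  ["eu", "mongodb://eu.cluster.example.invalid:27017"],
  ["us", "mongodb://us.cluster.example.invalid:27017"],
]);

// Returns which cluster to connect to and which database to use there.
function placementForTenant(tenant) {
  const uri = REGION_CLUSTERS.get(tenant.region);
  if (!uri) {
    throw new Error(`No cluster configured for region "${tenant.region}"`);
  }
  return { uri, dbName: `tenant_${tenant.id}` };
}
```

The application would then keep one client per cluster URI and select the tenant's database on it, so the number of connection pools grows with the number of regions rather than the number of tenants.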

Considering an enterprise solution: I would prefer to offer an on-premise deployment to customers.

In addition to all of the above: I would consider using the secondary members of the replica sets to serve read operations, rather than sending every read to the primary.
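
In the Node driver, this is controlled by the read preference. A minimal sketch, where `readOptions` and its `allowStale` flag are hypothetical names; `readPreference` with the values `"primary"` and `"secondaryPreferred"` is a standard MongoDB option:

```javascript
// Hypothetical helper: pick a read preference per query.
// Secondaries can lag behind the primary, so only opt in for reads
// that tolerate slightly stale data (reports, dashboards, etc.).
function readOptions({ allowStale = false } = {}) {
  return { readPreference: allowStale ? "secondaryPreferred" : "primary" };
}
```

A reporting query could then be issued as `collection.find(filter, readOptions({ allowStale: true }))`, while anything that must see the latest write keeps the default and reads from the primary.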