Optimal architecture for a multitenant application on Django

Question

I've been brooding over the right/optimal way to create a multitenancy application based on Django.

Some explanation:

The application can be used by several tenants (tenant1, tenant2, ...).
All tenant-specific data has to be secured against access by other tenants (and their users).
Optionally, tenants can create additional custom fields for application objects.
Of course, the underlying hardware limits the number of tenants on one "system".

1) Separating each tenant by e.g. sub-domain and using tenant-specific databases in the underlying layer

2) Using some tenant-ID in the model to separate the tenant-data in the database

I am thinking about deployment-processes, performance of the system-parts (web-server(s), database-server(s), working-node(s),...)

What would be the best setup? What are the pros and cons?

What do you think?

Thx :-) Any ideas or additional aspects I did not mention? — Oliver Rehburg
Django has its Sites framework built-in, which I think can be extended to have better multitenant support. There needs to be a way to identify which "site" we need to pull content for from something in the request, instead of hard coding which site in settings.py — Brandon
NoSQL support can also make multitenancy easier, but there's currently no good way that I'm aware of to tell which shard a particular site's data sits on. — Brandon
After looking into using the Sites framework, I came to the conclusion that its purpose is to make sharing data between different sites easier -- which is exactly the opposite of what you want for a multi-tenant app, for data security reasons. — Nexus

Reto Aebersold · Accepted Answer · 2011-08-25T17:56:29

We built a multitenancy platform using the following architecture. I hope you can find some useful hints.

1. Each tenant gets a sub-domain (t1.example.com).
2. Using URL rewriting, the requests for the Django application are rewritten to something like example.com/t1.
3. All URL definitions are prefixed with something like (r'^(?P<tenant_id>[\w\-]+)
4. A middleware processes and consumes the tenant_id and adds it to the request (e.g. request.tenant = 't1').
5. Now the current tenant is available in each view without having to specify the tenant_id argument in every view.
6. In some cases you don't have the request available. I solved this issue by binding the tenant_id to the current thread (similar to the current language, using threading.local).
7. Create decorators (e.g. a tenant-aware login_required), middlewares or factories to protect views and select the right models.
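
The middleware and thread-binding steps above could look roughly like the following. This is only a minimal sketch, not the code from our platform: the names TenantMiddleware, set_current_tenant and get_current_tenant are made up for illustration, and it assumes Django hands the same keyword-argument dict to process_view and then to the view, so that removing tenant_id there keeps it out of the view signature.

```python
import threading

# Per-thread storage for the tenant, as suggested above with threading.local.
_state = threading.local()


def set_current_tenant(tenant_id):
    """Bind the tenant to the current thread (None clears the binding)."""
    _state.tenant = tenant_id


def get_current_tenant():
    """Return the tenant bound to this thread, or None outside a request."""
    return getattr(_state, "tenant", None)


class TenantMiddleware:
    """Hypothetical middleware: moves tenant_id from the URL kwargs to the request."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        try:
            return self.get_response(request)
        finally:
            # Worker threads are reused, so always reset the binding afterwards.
            set_current_tenant(None)

    def process_view(self, request, view_func, view_args, view_kwargs):
        # Pop in place so views do not need a tenant_id parameter.
        tenant_id = view_kwargs.pop("tenant_id", None)
        request.tenant = tenant_id
        set_current_tenant(tenant_id)
        return None  # let Django continue and call the view
```

Code that has no request at hand (model managers, signal handlers and so on) can then call get_current_tenant(). Note that threading.local only fits one-request-per-thread deployments; an async setup would need a different mechanism.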
Regarding the databases, I used two different scenarios:
- Set up multiple databases and configure routing according to the current tenant. I used this first but switched to one database after about one year. The reasons were the following:
  - We didn't need a highly secure solution to separate the data
  - The different tenants used almost all the same models
  - We had to manage a lot of databases (and hadn't built an easy update/migration process)
- Use one database with some simple mapping tables, e.g. for users and different models. To add additional, tenant-specific model fields we use model inheritance.

Regarding the environment we use the following setup:

Nginx
uWSGI
PostgreSQL
Memcached

From my point of view this setup has the following pros and cons:

Pro:

One application instance knowing the current tenant
Most parts of the project don't have to bother with tenant specific issues
Easy solution for sharing entities between all tenants (e.g. messages)

Contra:

One quite large database
Some very similar tables due to the model inheritance
Not secured on the database layer

Of course, the best architecture strongly depends on your requirements, such as the number of tenants, the delta between your models, security requirements and so on.

Update: After reviewing our architecture, I suggest not rewriting the URL as described in points 2-3. I think a better solution is to pass the tenant_id as a request header and extract it (point 4) from the request with something like request.META.get('TENANT_ID', None). This way you get neutral URLs, and it's much easier to use Django built-in functions (e.g. {% url ...%} or reverse()) or external apps.
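
A rough sketch of that header-based variant follows. The class name and the X-Tenant-Id header are hypothetical. One detail worth knowing: a plain key such as TENANT_ID only shows up in request.META if the front-end server passes it as a WSGI/uwsgi variable, whereas a real HTTP request header named X-Tenant-Id is exposed by Django as HTTP_X_TENANT_ID. The sketch checks both.

```python
class TenantHeaderMiddleware:
    """Hypothetical middleware: reads the tenant from request.META."""

    # Variable set by the front-end server, as in the snippet above.
    PARAM_KEY = "TENANT_ID"
    # Assumed header name "X-Tenant-Id" as Django exposes it in META.
    HEADER_KEY = "HTTP_X_TENANT_ID"

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        meta = request.META
        # Prefer the server-set variable; fall back to the request header.
        request.tenant = meta.get(self.PARAM_KEY) or meta.get(self.HEADER_KEY)
        return self.get_response(request)
```

Whichever key is used, the value should be set (or overwritten) by the front-end server based on the sub-domain, since a header sent directly by a client could otherwise be used to claim another tenant.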
