0
votes

I am designing common web-based CRM application for huge number of tenants. Users (tenants) sign up online for use of the application. Initially, there will not be so many users but in future there are possibilities.

I want to use single shared MySQL database. It will be impossible to create separate database for each tenant because of the chosen scenario and future functionality integrations. The programming will be in PHP.

But, how should I address data scalability issues:

  1. What if rows in the table exceed the size of table. How to address this issue?
  2. If I use auto increment BIGINT primary key, for example for 'contacts' table. What will happen after the largest value of BIGINT reached?
  3. Is it best practice to use foreign key constraints in real huge data tables? How will it affect the performance of the application, if used or not used?
  4. Will MySQL be good fit for this kind of applications?
  5. What is Zoho CRM's database technique for multi-tenancy?
1
Merely asking these questions suggests that you have nothing to worry about, because you don't understand numbers and scaling. The largest bigint is 9,223,372,036,854,775,807. Given that that is about a billion times larger than the number of humans, I wouldn't worry about your contacts table overflowing -- assuming that your contacts are human. - Gordon Linoff
I had to look that number up as I had no idea just how big but "nine quintillion two hundred twenty-three quadrillion three hundred seventy-two trillion thirty-six billion eight hundred fifty-four million seven hundred seventy-five thousand eight hundred seven" makes my head spin - Professor Abronsius

1 Answers

1
votes

MySQL is pretty good a scaling up, even with enormous tables. Basically you can just put your database on larger and more powerful servers to handle the demand. In my experience it's usually limited by RAM.

Once that technique starts getting dicey you can scale out by creating read replicas of the database. Basically these are read-only copies of your master database that are continuously synchronized with the master. In your application use two different database connections. The first connection is to a read-replica and is used for all SELECT statements. The other connection is to your master to be used for all INSERT, UPDATE, and DELETE statements. Since many applications do more SELECTs than anything else and there is very little limit on how many read-replicas you can create this will greatly expand your potential scale.

In MySQL I tend to use a single database for all tenants and segment the data by using different database usernames for each tenant. Through a tenant_id column and views that filter by the tenant_id I can assure that tenants don't have any access to other tenant's data. I wrote a blog post on how I was able to convert a single-tenant application to multi-tenant in a weekend: https://opensource.io/it/mysql-multi-tenant/

Having a single database and single codebase for all tenants is much easier to maintain than multiple databases or schemas.