I was watching this video about the CAP theorem, where the author explains well the trade-offs of distributed systems. However I disagree with the CAP theorem in the following aspect. Given the picture below:
Whenever there is a partition, in other words, whenever a slave loses its connection to the master, this slave immediately becomes unavailable. So you will say: You are choosing consistency over availability. And I will say NO!. My distributed system is still highly available because there are many other backup/redundant slave nodes that the client can failover to. So I'm keeping my consistency and I'm keeping my availability in the system. A failing slave node is immediately (and automatically) taking offline and the client is redirected to another slave node for reads.
Then you might say: now what happens if the master node dies, or if you have a partition where two master nodes are active? And the answer is simple: Your system must NEVER allow two master nodes to be active. Your system must always have one and only one master node with as many backup master nodes as you want, however all the backup master nodes will be inactive (i.e. not accepting writes and just building redundant state).
The only trade-off of such a system, because nothing is perfect: It will need human intervention for the case of a dying / bad state master, so that the active master can be shutdown by a human and guaranteed to be dead while the operator turns on (manually) one of the backup masters to take over write requests.
I've been thinking for a long time on how to eliminate this human intervention, but I don't think it is possible due to the fact that a machine cannot reliably determine the state of another machine in a distributed system. A human needs to make this decision and manually pull the plug to kill it.
Wouldn't this simple trade-off (human operator for the rare cases when the master is dying) beat the CAP theorem?