17
votes

If I understand the CAP Theorem correctly, availability means that the cluster continues to operate even if a node goes down.

I've seen a lot of people (http://blog.nahurst.com/tag/guide) list RDBMS as CA, but I do not understand how RBDMS is available, as if a node goes down, the cluster must go down to maintain consistency.

My only possible answer to this has been that most RDBMS are a single node, so there is no "non-failing" node. But, this seems to be a technicality, not true 'availability' and definitely not high availability.

Thank you.

3

3 Answers

13
votes

First of all, let me clarify and state that the consistency in RDBMS is different than consistency in distributed systems. RDBMS (single system) applies consistency to transactional consistency, where as in distributed systems consistency means view from anywhere in the system (read from any node) is consistent. So RDMBS single node cannot be discussed with regards to CAP theorem. It is like comparing apple to orange.

RDBMS with master-slave can be compared to distributed systems. Here RDBMS can be configured to CA/CP or AP. MySQL for example, provides a way to configure the system in a way that if there is a quorum loss (not enough secondary available for commit log replication), the cluster is not available (CP system). MySQL also provides a configuration to allow the cluster to operate as long as master is available (CA system) with the potential of data loss. SQL Server AlwaysOn is an AP system, because commit log replication is asynchronous (even on sync replicas).

So RDBMS can be any of CA, CP or AP in a distributed world.

1
votes

I believe you are misunderstanding the relation between CAP-Availability and node-UP/DOWN. Availability is about providing an answer to every received query - when a node is down it cannot receive queries, therefore if you bring down parts of or the entire cluster, the CAP-Availability property holds. Although this may sound counter intuitive at first glance, by shutting down nodes you are holding on to CAP-Availability and dropping CAP-Partition tolerance instead. I've recently posted an answer whose examples provide some clarification.

In a nutshell: A partition occurs that isolates node N. If N receives a request it can either: i) answer which grants availability but drops consistency because N is out of sync; ii) do not answer to avoid replying with an out-of-date result, thereby dropping availability because we received a request but issued no reply for it.

Alternatively we can shutdown N as soon as it becomes disconnected from the rest of the cluster which allows us to keep C and A, but drop P, because: i) N will not receive any requests; ii) all received requests will be performed to the fully connected and consistent cluster, hence they will all be answered with consistent values; iii) the cluster is not partition tolerant because it does not tolerate partitions - instead it shutdowns partitioned nodes.

-1
votes

In CAP Theorem P is for Partition tolerance , which is the ability of system to handle partitions(partitions are isolated clusters - due to network failure or any other reason ..).

In a distributed network to handle a partition , system has to pick either Consistency or Availability.

In case of RDBMS there is no chance for partitions (assuming not distributed which is normal case) ,So Those will be always CA.