NoSQL: What does it mean for MongoDB or BigTable to not always be "Available"

Question

Reading Nathan Hurst's Visual Guide to NoSQL Systems, he includes the CAP triangle:

Consistency
Availibility
Partition Tolerance

enter image description here

With SQL Server being an AC system, and MongoDB being a CP system.

These definitions from come a UC Berkley professor Eric Brewer, and his talk at PODC 2000 (Principles of Distributed Computing):

Availability

Availability means just that - the service is available (to operate fully or not as above). When you buy the book you want to get a response, not some browser message about the web site being uncommunicative. Gilbert & Lynch in their proof of CAP Theorem make the good point that availability most often deserts you when you need it most - sites tend to go down at busy periods precisely because they are busy. A service that's available but not being accessed is of no benefit to anyone.

What does it mean, in the context of MongoDB, or BigTable, for the system to not be "available"?

Do you go to connect (e.g. over TCP/IP), and the server does not respond? Do you attempt execute a query, but the query never returns - or returns an error?

What does it mean to not be available?

Chris Shain Chris Shain · Accepted Answer · 2011-09-07T19:41:16

Availability in this case means that in the event of a network partition, the server that a client connects to may not be able to guarantee the level of consistency that the client expects (or that the system is configured to provide).

Assuming that you have 3 nodes, A, B, and C, in a hypothetical distributed system. A, B, and C are each running in their own rack of servers, with 2 switches between them:

[Node A] <- Switch #1 -> [Node B] <- Switch #2 -> [ Node C ]

Now assume that said system is set up so that it is GUARANTEED that any write will go to at least 2 nodes before it is considered committed. Now, lets assume that switch #2 gets unplugged, and some client is connected to node C:

[Node A] <- Switch #1 -> [Node B]                 [ Node C ] <-- Some client

That client will not be able to issue Consistent writes, because the distributed system is currently in a partitioned state (namely, Node C cannot contact enough other nodes to guarantee the 2-node consistency required).

I'd add to this that some NoSQL databases allow very dynamic selection of CAP attributes. Cassandra, for instance, allows clients to specify the number of servers that a write must go to before it is committed on a per-write basis. Writes going to a single server are "AP", writes going to a quorum (or all) servers are more "CA".

EDIT - from the comments below:

In MongoDB you can only have master/slave configuration within a replica set. What this means is that the choice of AP vs CP is made by the client at query time. The client can specify slaveOk, which will read from an arbitrarily selected slave (which may have stale data): mongodb.org/display/DOCS/…. If the client is not OK with stale data, don't specify slaveOk and the query will go to the master. If the client cannot reach the master, then you'll get an error. I'm not sure exactly what that error will be.

NoSQL: What does it mean for MongoDB or BigTable to not always be "Available"

2 Answers