MongoDB and Cassandra data storing mechanism

Question

I have been reading about MongoDB and Cassandra. MongoDB is master/slave, whereas Cassandra is masterless (all nodes are equal). My question is about how data is stored in each of them.

Let's say a user sends a write request to MongoDB (a cluster with a master and several slaves, each on a separate machine). This means the master will decide (or some application logic will decide) which slave the update should be written to. That is, the same data will not be available on all the nodes in MongoDB, and each node's size may vary. Am I right? Also, when a query arrives, will the master know which node the request should be sent to?

In the case of Cassandra, the same data will be written to all the nodes, i.e. if one node's size is 10 GB, then every other node's size is also 10 GB. Only in that case would the user lose no data when one node fails, because another node can be queried instead. Am I right here? If I am right and the same data is available on all the nodes, then what is the advantage of using map/reduce in Cassandra? If I am wrong, then how is availability maintained in Cassandra, since the same data will not be available on another node?

I searched Stack Overflow for MongoDB vs Cassandra and read about 10 posts, but the answers there did not resolve my questions. Please clear up my doubts, and correct me if my assumptions are wrong.

Mark Hillick · Accepted Answer · 2012-05-30T14:37:43

Regarding MongoDB, yep, you're right, there is only one primary.

Any secondary can become primary as long as it is in sync, since that means the secondary has all the data. The nodes don't have to be the same size on disk; that can vary depending on when the replication was done. However, they do hold the same data (as long as they're in sync).
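To make that concrete, here is a rough toy sketch in plain Python. It is not MongoDB's actual implementation and uses no MongoDB API; the `Node` and `ReplicaSet` classes and the list-based oplog are hypothetical names made up for illustration. It only shows the idea that writes go through one primary, secondaries replay the same operations, and an in-sync secondary can take over with the full data set.

```python
class Node:
    """Hypothetical replica set member that holds a full copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.applied = 0  # number of oplog entries applied so far

    def apply(self, oplog):
        # Replay any oplog entries this node has not applied yet.
        for key, value in oplog[self.applied:]:
            self.data[key] = value
        self.applied = len(oplog)


class ReplicaSet:
    """Toy model: one primary, the remaining nodes are secondaries."""

    def __init__(self, names):
        self.nodes = [Node(n) for n in names]
        self.primary = self.nodes[0]
        self.oplog = []

    def write(self, key, value):
        # All writes go through the single primary.
        self.oplog.append((key, value))
        self.primary.apply(self.oplog)

    def replicate(self):
        # Secondaries catch up by replaying the primary's oplog.
        for node in self.nodes:
            node.apply(self.oplog)

    def fail_over(self):
        # Assumption of this sketch: only an in-sync secondary is promoted.
        candidates = [n for n in self.nodes
                      if n is not self.primary and n.applied == len(self.oplog)]
        if not candidates:
            raise RuntimeError("no in-sync secondary available")
        self.nodes.remove(self.primary)
        self.primary = candidates[0]
        return self.primary


rs = ReplicaSet(["a", "b", "c"])
rs.write("user:1", "alice")
rs.replicate()
new_primary = rs.fail_over()
print(new_primary.name, new_primary.data)  # b {'user:1': 'alice'}
```

After the old primary is dropped, node `b` already has everything that was written, which is why it can step in without any data being lost.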

I don't know much about Cassandra, sorry!
