0
votes

I have a 3-node cluster with a replication factor of 2, but data appears to be replicated on all 3 nodes. This is how I create my keyspace:

CREATE KEYSPACE IF NOT EXISTS DEMO WITH replication = {'class':'SimpleStrategy', 'replication_factor':2};

What's missing here?

3
How can you know it's "getting replicated on all 3 nodes"? – Rocherlee

3 Answers

1
votes

Cassandra distributes data based on the partition key of each row (the first component of the primary key). A table is generally spread over all the machines in the cluster, but when you insert a row, it is written to only two machines. These two machines are not random; they are determined by the token of the partition key and can be looked up with nodetool.

If you want to know more about how data is distributed by partition key, take a look at the documentation on Cassandra partitioners.

0
votes

Data is distributed over all 3 nodes, and each node holds 2 pieces of data: its own piece, covering the token range assigned to it, and a replica of the data belonging to its neighbor node. So every node holds some data, but any given row lives on only 2 of them.
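
To make that concrete, here is a minimal Python sketch of SimpleStrategy-style placement on a toy ring. It is only an illustration, not Cassandra's actual code: the node names, the token values, the ring size, and the md5-based `token_for` are all assumptions made up for this example (a real cluster would use Murmur3Partitioner and usually vnodes).

```python
import bisect
import hashlib
from collections import Counter

# Hypothetical ring: one token per node (real clusters usually use vnodes).
RING = [(0, "node1"), (100, "node2"), (200, "node3")]
RING_SIZE = 300
REPLICATION_FACTOR = 2


def token_for(key):
    """Stand-in for the partitioner: md5 here, NOT Cassandra's Murmur3."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % RING_SIZE


def replicas_for(key, rf=REPLICATION_FACTOR):
    """Owner of the key's token, then the next rf-1 nodes clockwise."""
    tokens = [t for t, _ in RING]
    # First node whose token is >= the key's token, wrapping around the ring.
    start = bisect.bisect_left(tokens, token_for(key)) % len(RING)
    return [RING[(start + i) % len(RING)][1] for i in range(rf)]


if __name__ == "__main__":
    rows_per_node = Counter()
    for i in range(1000):
        owners = replicas_for(f"user{i}")
        rows_per_node.update(owners)
    # Every row has exactly 2 replicas, yet all 3 nodes end up holding rows.
    for node, count in sorted(rows_per_node.items()):
        print(node, "holds", count, "of 1000 rows")
```

Each key maps to exactly 2 nodes, but different keys map to different pairs, so over many rows all 3 nodes end up storing data. That is most likely what looks like "replicated on all 3 nodes".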

0
votes

Try running getendpoints for any partition key in a table within that keyspace. It returns the list of nodes that hold that partition. In this case, you should see only 2 nodes in the output.

$ nodetool getendpoints <keyspace> <table> <key>