10
votes

I'm having trouble understanding the replication factor in Cassandra. The documentation says: "The total number of replicas across the cluster is often referred to as the replication factor". On the other hand, the same documentation says that "NetworkTopologyStrategy allows you to specify how many replicas you want in each data center". So, if I have 2 data centers with NetworkTopologyStrategy, does a replication factor of 2 mean I'll have 2 replicas per data center, or 2 replicas overall in the cluster?

Thank you.

2 Answers

12
votes

When using the NetworkTopologyStrategy, you specify your replication factor on a per-data-center basis using strategy_options:{data-center-name}={rep-factor-value} rather than the global strategy_options:replication_factor={rep-factor-value}.

Here's a concrete example adapted from http://www.datastax.com/docs/1.0/references/cql/CREATE_KEYSPACE

CREATE KEYSPACE Excalibur WITH strategy_class = 'NetworkTopologyStrategy'
  AND strategy_options:DC1 = 2 AND strategy_options:DC2 = 2;

In that example, any given column would be stored on 4 nodes total, with 2 in each data center.
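
The arithmetic behind that total can be sketched in a few lines of Python. This is only an illustration: `total_replicas` is a hypothetical helper, not part of Cassandra or any driver, and the dictionary simply mirrors the per-data-center options from the statement above.

```python
def total_replicas(per_dc_replication):
    """Sum per-data-center replication factors into a cluster-wide replica count."""
    return sum(per_dc_replication.values())


# Mirrors strategy_options:DC1 = 2 and strategy_options:DC2 = 2
excalibur = {"DC1": 2, "DC2": 2}
print(total_replicas(excalibur))  # 4
```

With NetworkTopologyStrategy, the number you set is per data center, and the cluster-wide count is whatever those per-data-center values add up to.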

3
votes

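The replication factor is basically the number of replicas (copies) of your data that you want to have. It counts every copy, not just the additional ones, so a replication factor of 1 means a single copy.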

One thing to remember is the commonly stated rule: "Number of replicas should not be more than number of nodes". So if you have 2 nodes, you are not supposed to use 3 as the replication factor.
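
A rough way to check that rule, assuming it is applied per data center when NetworkTopologyStrategy is used (my assumption, not something stated above), could look like this. `oversized_datacenters` is a hypothetical helper, and the replication factors and node counts are made up for illustration.

```python
def oversized_datacenters(per_dc_replication, nodes_per_dc):
    """Return the data centers whose replication factor exceeds their node count."""
    return [
        dc
        for dc, rf in per_dc_replication.items()
        if rf > nodes_per_dc.get(dc, 0)  # an unknown data center counts as 0 nodes
    ]


replication = {"DC1": 3, "DC2": 2}  # hypothetical per-data-center factors
nodes = {"DC1": 2, "DC2": 2}        # hypothetical node counts
print(oversized_datacenters(replication, nodes))  # ['DC1']
```

Here DC1 is flagged because it asks for 3 replicas on only 2 nodes, which is the situation described above.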