1
votes

Can I prevent a keyspace from syncing over to another datacenter by NOT including the other datacenter in my keyspace replication definition? Apparently, this is not the case.

In my own test, I have set up two Kubernetes clusters in GCP, each serving as a Cassandra datacenter. Each k8s cluster has 3 nodes.

I set up datacenter DC-WEST first, and created a keyspace demo using this: CREATE KEYSPACE demo WITH replication = {'class': 'NetworkTopologyStrategy', 'DC-WEST': 3};

Then I set up datacenter DC-EAST, without adding any user keyspaces.

To join the two datacenters, I modified the CASSANDRA_SEEDS environment variable in the Cassandra StatefulSet YAML to include seed nodes from both datacenters (I use host networking).

But after that, I notice the keyspace demo is synced over to DC-EAST, even though the keyspace only has DC-WEST in the replication.

cqlsh> select data_center from system.local
... ;

data_center
-------------
DC-EAST     <-- Note: this is from the DC-EAST datacenter

(1 rows)
cqlsh> desc keyspace demo

CREATE KEYSPACE demo WITH replication = {'class': 'NetworkTopologyStrategy', 'DC-WEST': '3'}  AND durable_writes = true;

So the demo keyspace is visible in DC-EAST, even though it should be replicated only to DC-WEST! What am I doing wrong?


2 Answers

3
votes

Cassandra replication strategies control where data is placed, but the actual schema (the existence of keyspaces, tables, etc.) is global.

If you create a keyspace that only lives in one DC, all other DCs will still see the keyspace in their schema, and will even make the directory structure on disk, though no data will be replicated to those hosts.
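One way to see this from cqlsh is to read the keyspace definition out of the schema tables. This is only a sketch, assuming Cassandra 3.0 or later (where the system_schema keyspace exists) and the demo keyspace from the question:

```cql
-- Run this on a node in DC-EAST and again on a node in DC-WEST.
-- The schema is cluster-wide, so both should return the same row,
-- with a replication map that lists only DC-WEST.
SELECT keyspace_name, replication
FROM system_schema.keyspaces
WHERE keyspace_name = 'demo';
```

Seeing the same row in both datacenters only shows that the schema is shared. To check where the data actually lives, running nodetool status demo should report effective ownership on the DC-WEST nodes only, assuming the replication settings shown in the question.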

2
votes

You didn't specify how you deployed your Cassandra cluster in Kubernetes, but it looks like your nodes in DC-WEST may be configured to say that they are DC-EAST.

I would check the ConfigMap for the StatefulSet in DC-WEST. Maybe it has the DC-EAST value in cassandra-rackdc.properties(?). More info on the cassandra-rackdc.properties file here.
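A quick way to test this guess from cqlsh, before digging through the ConfigMap, is to ask a node which datacenter it and its peers report. This is a sketch using the standard system tables; run it while connected to a pod that you know belongs to the DC-WEST StatefulSet:

```cql
-- Datacenter and rack that the node you are connected to reports for itself.
SELECT data_center, rack FROM system.local;

-- Datacenter and rack that this node has recorded for every other node.
SELECT peer, data_center, rack FROM system.peers;
```

If a DC-WEST pod reports DC-EAST here, the snitch configuration is the likely culprit. If every node reports the datacenter you expect, the configuration is probably fine.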