
I am experiencing sync issues between nodes in the same datacenter in Cassandra. The keyspace is set to a replication factor of 3 with NetworkTopologyStrategy and has 3 nodes in the DC, effectively ensuring that each node has a copy of the data. When nodetool status is run, it shows that all three nodes in the DC own 100% of the data.

Yet the applied_migrations column family in that keyspace is not in sync. This is strange because only this single column family is affected; all the other column families in the keyspace are fully replicated among the three nodes. The test was done by running a row count on each of the column families in the keyspace.

keyspace_name | durable_writes | strategy_class                                       | strategy_options
--------------+----------------+------------------------------------------------------+----------------------------
core_service  |           True | org.apache.cassandra.locator.NetworkTopologyStrategy |          {"DC_DATA_1":"3"}


keyspace: core_service

Datacenter: DC_DATA_1
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address                      Load       Tokens  Owns (effective)  Host ID                               Rack
UN  host_ip_address_1_DC_DATA_1  3.75 MB    256     100.0%            3851106b                              RAC1
UN  host_ip_address_2_DC_DATA_1  3.72 MB    256     100.0%            d1201142                              RAC1
UN  host_ip_address_3_DC_DATA_1  3.72 MB    256     100.0%            81625495                              RAC1
Datacenter: DC_OPSCENTER_1
==========================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address                           Load       Tokens  Owns (effective)  Host ID                               Rack
UN  host_ip_address_4_DC_OPSCENTER_1  631.31 MB  256     0.0%              39e4f8af                              RAC1

Query: select count(*) from core_service.applied_migrations;

host_ip_address_1_DC_DATA_1 core_service applied_migrations

 count
-------
     1

(1 rows)
host_ip_address_2_DC_DATA_1 core_service applied_migrations

 count
-------
     2

(1 rows)
host_ip_address_3_DC_DATA_1 core_service applied_migrations

 count
-------
     2

(1 rows)
host_ip_address_4_DC_OPSCENTER_1 core_service applied_migrations

 count
-------
     2

(1 rows)

A similar error to the one described in the issue below is received. Because not all rows of data are available, the migration script fails when it tries to create a table that already exists: https://github.com/comeara/pillar/issues/25


1 Answer


I require strong consistency

If you want to ensure that your reads are consistent, you need to use the right consistency levels.

For RF 3, the following are your options:

  1. Write at CL ALL and read at CL ONE or greater.
  2. Write at CL QUORUM and read at CL QUORUM (a sketch follows this list). This is what's recommended by Magro, who opened the issue you linked to. It's also the most common option because you can lose one node and still read and write.
  3. Write at CL ONE but read at CL ALL.
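
For illustration, here's a minimal sketch of option 2 with the DataStax Java driver (I'm assuming the 3.x API; the contact point and the INSERT columns are placeholders, not necessarily Pillar's actual schema):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class QuorumExample {
    public static void main(String[] args) {
        // Placeholder contact point: use one of your DC_DATA_1 nodes.
        try (Cluster cluster = Cluster.builder()
                .addContactPoint("host_ip_address_1_DC_DATA_1").build();
             Session session = cluster.connect("core_service")) {

            // Write at QUORUM: 2 of the 3 replicas must acknowledge the write.
            Statement write = new SimpleStatement(
                    "INSERT INTO applied_migrations (authored_at, description) VALUES (?, ?)",
                    new java.util.Date(), "example") // columns are illustrative only
                    .setConsistencyLevel(ConsistencyLevel.QUORUM);
            session.execute(write);

            // Read at QUORUM: every read overlaps with the last successful QUORUM write.
            Statement read = new SimpleStatement("SELECT count(*) FROM applied_migrations")
                    .setConsistencyLevel(ConsistencyLevel.QUORUM);
            long count = session.execute(read).one().getLong(0);
            System.out.println("applied_migrations rows: " + count);
        }
    }
}

With QUORUM on both writes and reads, any read is guaranteed to reach at least one replica that took the latest write, which is what strong consistency means here.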

What does Cassandra do to improve consistency?

Cassandra's anti-entropy mechanisms are:

Repair ensures that your nodes are consistent. It gives you a consistency baseline, and for this reason it should be run as part of your regular maintenance operations. Run repair more often than your gc_grace_seconds in order to prevent deleted data from coming back (zombies). DataStax OpsCenter has a Repair Service that automates this task.

Manually you can run:

nodetool repair on one node, or

nodetool repair -pr on each of your nodes. The -pr option ensures you only repair a node's primary ranges.

Read repair happens probabilistically (configurable in the table definition). When you read a row, c* will notice if some of the replicas don't have the latest data and fix it.
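
As a concrete (and version-dependent) illustration, the per-table read repair probability can be raised with an ALTER TABLE. The dclocal_read_repair_chance option used below exists on Cassandra 2.x/3.x but was removed in 4.0, and the contact point is again a placeholder:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class ReadRepairChanceExample {
    public static void main(String[] args) {
        // Placeholder contact point: use one of your DC_DATA_1 nodes.
        try (Cluster cluster = Cluster.builder()
                .addContactPoint("host_ip_address_1_DC_DATA_1").build();
             Session session = cluster.connect()) {
            // Roughly 10% of reads on this table will also compare the other
            // replicas in the local DC in the background and repair mismatches.
            session.execute("ALTER TABLE core_service.applied_migrations "
                    + "WITH dclocal_read_repair_chance = 0.1");
        }
    }
}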

Hints are collected by other nodes when a node is unavailable to take a write, and are replayed to it once it comes back up.

Manipulating c* Schemas

I noticed that the whole point of Pillar is "to automatically manage Cassandra schema as code". This is a dangerous notion, especially if Pillar is a distributed application (I don't know if it is), because it may cause schema collisions that can leave a cluster in a wacky state.

Assuming that Pillar is not a distributed / multi-threaded system, you can ensure that you do not break the schema by using the Java driver's checkSchemaAgreement() before and after schema modifications.
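
Here's a minimal sketch of that pattern (DataStax Java driver 3.x assumed; the contact point and the CREATE TABLE statement are stand-ins, not Pillar's actual migration DDL):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class SchemaAgreementExample {

    // Poll until all live nodes report the same schema version, or give up.
    static void waitForSchemaAgreement(Cluster cluster) throws InterruptedException {
        for (int i = 0; i < 30; i++) {
            if (cluster.getMetadata().checkSchemaAgreement()) {
                return;
            }
            Thread.sleep(1000);
        }
        throw new IllegalStateException("Cluster has not reached schema agreement");
    }

    public static void main(String[] args) throws InterruptedException {
        // Placeholder contact point: use one of your DC_DATA_1 nodes.
        try (Cluster cluster = Cluster.builder()
                .addContactPoint("host_ip_address_1_DC_DATA_1").build();
             Session session = cluster.connect("core_service")) {

            waitForSchemaAgreement(cluster); // before: don't stack a change on a pending one

            // Stand-in for whatever DDL the migration runs.
            session.execute("CREATE TABLE IF NOT EXISTS example_table (id uuid PRIMARY KEY, value text)");

            waitForSchemaAgreement(cluster); // after: wait for the change to reach every node
        }
    }
}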

Long term

Cassandra schema changes will become more robust and better able to handle distributed updates. Watch (and vote for) CASSANDRA-9424.