
I have a 3-node cluster with a replication factor of 3. The consistency level is QUORUM for both writes and reads. The traffic has three major steps:

  • Create:
    • Rowkey: xxxx
    • Column: status=new, requests="xxxxx"
  • Update:
    • Rowkey: xxxx
    • Column: status=executing, requests="xxxxx"
  • Delete:
    • Rowkey: xxxx

When one node is down, the cluster keeps working, as the consistency configuration allows, and in the final state all requests are finished and deleted.
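As a quick check of why the cluster keeps working with one node down, here is a minimal sketch of the quorum arithmetic. The function names are made up for illustration; the formula (replication factor divided by two, rounded down, plus one) is the usual definition of QUORUM.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond for a QUORUM read or write."""
    return replication_factor // 2 + 1


def can_satisfy_quorum(replication_factor: int, live_replicas: int) -> bool:
    """True if enough replicas are alive to serve a QUORUM request."""
    return live_replicas >= quorum(replication_factor)


# Replication factor 3: QUORUM needs 2 replicas, so one dead node is tolerated,
# but two dead nodes are not.
print(quorum(3))
print(can_satisfy_quorum(3, live_replicas=2))
print(can_satisfy_quorum(3, live_replicas=1))
```

Because a QUORUM write and a QUORUM read each touch 2 of the 3 replicas, any read overlaps with the latest write on at least one replica, which is why the empty result seen while the node was down is the expected one.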

So if I run the Cassandra client to list the result (also with consistency QUORUM), it shows empty rows (only the row key is left), which is correct.

But if we start the dead node again, hinted handoff writes the missed data back to this node, so it receives a lot of create, update, and delete operations.

I don't know whether it is due to GC or compaction, but the delete records on the other two nodes seem not to work: if I use the Cassandra client to list the data (also with consistency QUORUM), the deleted rows show up again with column values, as if the recovered node were replaying the history.

And if I check the data several times with the client, I can see the data changing. It seems that while hinted handoff replays the operations, the deleted data shows up and then disappears.

Is there a way to make this procedure invisible from outside until hinted handoff has finished?

What I want is synchronization of the final state. The temporary state is out of date and incorrect, and should never be seen from outside.

Is it due to using a row delete instead of a column delete? Or to compaction?


1 Answer


After checking the logs and configuration, I found it was caused by two things.

  1. GC grace seconds

    I am using the Hector client to connect to Cassandra, and with it the default value of GC grace seconds for each column family is zero! So by the time hinted handoff replays the temporary values, the tombstones on the other two nodes have already been removed by compaction, and the client then gets the temporary value.

  2. Secondary index

    Even after fixing the first problem, I could still get temporary results from the Cassandra client. I used a command like "get my_cf where column_one='value'" to query the data, and the temporary value showed up again. But when I queried the record again by row key, it had disappeared. From our client we always use the row key to get the data, and that way I never got the temporary value.

    So it seems the secondary index is not bound by the consistency configuration.

    After I changed GC grace seconds to 10 days, our problem was solved, but the behavior of index queries is still strange.
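To illustrate the first cause (GC grace seconds), here is a toy model in Python. It is only a sketch under simplifying assumptions: `Replica`, `read_merged`, and `simulate` are hypothetical names, timestamps are small integers, and there are two replicas answering the read instead of a real cluster. It is not how Cassandra is implemented, only the last-write-wins idea plus tombstone purging.

```python
from dataclasses import dataclass, field

TOMBSTONE = object()  # marker stored in place of a value when data is deleted


@dataclass
class Replica:
    gc_grace_seconds: int
    cells: dict = field(default_factory=dict)  # key -> (timestamp, value)

    def apply(self, key, timestamp, value):
        """Last write wins: keep the cell with the newest timestamp."""
        current = self.cells.get(key)
        if current is None or timestamp > current[0]:
            self.cells[key] = (timestamp, value)

    def compact(self, now):
        """Purge tombstones that are at least gc_grace_seconds old."""
        for key, (ts, value) in list(self.cells.items()):
            if value is TOMBSTONE and now - ts >= self.gc_grace_seconds:
                del self.cells[key]


def read_merged(replicas, key):
    """Merge the answers of the contacted replicas; the newest cell wins."""
    newest = None
    for replica in replicas:
        cell = replica.cells.get(key)
        if cell is not None and (newest is None or cell[0] > newest[0]):
            newest = cell
    if newest is None or newest[1] is TOMBSTONE:
        return None
    return newest[1]


def simulate(gc_grace_seconds):
    healthy = Replica(gc_grace_seconds)
    recovering = Replica(gc_grace_seconds)
    # While 'recovering' is down, the healthy replica sees create, update, delete.
    healthy.apply("xxxx", 1, "new")
    healthy.apply("xxxx", 2, "executing")
    healthy.apply("xxxx", 3, TOMBSTONE)
    healthy.compact(now=10)
    # Hinted handoff has replayed the create and update, but not yet the delete.
    recovering.apply("xxxx", 1, "new")
    recovering.apply("xxxx", 2, "executing")
    return read_merged([healthy, recovering], "xxxx")


print(simulate(gc_grace_seconds=0))       # tombstone already purged
print(simulate(gc_grace_seconds=864000))  # 10 days: tombstone still present
```

With a grace period of zero, the healthy replica has nothing left to outvote the replayed "executing" value, so the read returns it. With 10 days (864000 seconds), the tombstone is still there and carries the newest timestamp, so the row stays deleted.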