0
votes

Our production Cluster Cassandra version is: [cqlsh 5.0.1 | Cassandra 3.11.3 | CQL spec 3.4.4 | Native protocol v4]

After a restart to the Cassandra nodes the Cassandra did not started and the following error was printed:

INFO  [main] 2018-08-22 15:30:04,082 CommitLogReader.java:105 - Skipping playback of empty log: CommitLog-6-1534951460541.log
DEBUG [main] 2018-08-22 15:30:04,082 CommitLogReader.java:273 - Reading /var/lib/cassandra/commitlog/CommitLog-6-1527416281330.log (CL version 6, messaging version 11, compression null)
INFO  [Service Thread] 2018-08-22 15:30:06,501 GCInspector.java:284 - ParNew GC in 216ms.  CMS Old Gen: 10906456 -> 31114600; Par Eden Space: 859045888 -> 0; Par Survivor Space: 29166056 -> 43187600
DEBUG [main] 2018-08-22 15:30:06,673 CommitLogReader.java:264 - Finished reading /var/lib/cassandra/commitlog/CommitLog-6-1527416281330.log
DEBUG [main] 2018-08-22 15:30:06,674 CommitLogReader.java:273 - Reading /var/lib/cassandra/commitlog/CommitLog-6-1527416281331.log (CL version 6, messaging version 11, compression null)
DEBUG [main] 2018-08-22 15:30:08,009 CommitLogReader.java:264 - Finished reading /var/lib/cassandra/commitlog/CommitLog-6-1527416281331.log
DEBUG [main] 2018-08-22 15:30:08,009 CommitLogReader.java:273 - Reading /var/lib/cassandra/commitlog/CommitLog-6-1527416281332.log (CL version 6, messaging version 11, compression null)
ERROR [main] 2018-08-22 15:30:08,610 JVMStabilityInspector.java:102 - Exiting due to error while processing commit log during initialization.
org.apache.cassandra.db.commitlog.CommitLogReadHandler$CommitLogReadException: Unexpected error deserializing mutation; saved to /tmp/mutation1296995018372874453dat.  This may be caused by replaying a mutation against a table with the same name but incompatible schema.  Exception follows: java.io.IOError: java.io.EOFException: EOF after 45 bytes out of 33554712
    at org.apache.cassandra.db.commitlog.CommitLogReader.readMutation(CommitLogReader.java:471) [apache-cassandra-3.11.3.jar:3.11.3]
    at org.apache.cassandra.db.commitlog.CommitLogReader.readSection(CommitLogReader.java:404) [apache-cassandra-3.11.3.jar:3.11.3]
    at org.apache.cassandra.db.commitlog.CommitLogReader.readCommitLogSegment(CommitLogReader.java:251) [apache-cassandra-3.11.3.jar:3.11.3]
    at org.apache.cassandra.db.commitlog.CommitLogReader.readAllFiles(CommitLogReader.java:132) [apache-cassandra-3.11.3.jar:3.11.3]
    at org.apache.cassandra.db.commitlog.CommitLogReplayer.replayFiles(CommitLogReplayer.java:137) [apache-cassandra-3.11.3.jar:3.11.3]
    at org.apache.cassandra.db.commitlog.CommitLog.recoverFiles(CommitLog.java:177) [apache-cassandra-3.11.3.jar:3.11.3]
    at org.apache.cassandra.db.commitlog.CommitLog.recoverSegmentsOnDisk(CommitLog.java:158) [apache-cassandra-3.11.3.jar:3.11.3]
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:324) [apache-cassandra-3.11.3.jar:3.11.3]
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:602) [apache-cassandra-3.11.3.jar:3.11.3]
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:691) [apache-cassandra-3.11.3.jar:3.11.3]

After moving out the CommitLogs (which caused data loss) the Cassandra did started but queries on certain tables failed with

ReadFailure: Error from server: code=1300 [Replica(s) failed to execute read] message="Operation failed - received 0 responses and 1 failures" info={'failures': 1, 'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}

And the system.log:

WARN  [ReadStage-2] 2018-08-26 11:04:34,091 AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread Thread[ReadStage-2,10,main]: {}
java.lang.RuntimeException: org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: /var/lib/cassandra/data/policy/rule-83f10050a91f11e890846d2c86545d91/mc-52-big-Data.db
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2601) ~[apache-cassandra-3.11.3.jar:3.11.3]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_171]
    at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162) ~[apache-cassandra-3.11.3.jar:3.11.3]
    at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134) [apache-cassandra-3.11.3.jar:3.11.3]
    at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [apache-cassandra-3.11.3.jar:3.11.3]
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_171]

After investigation I pretty sure that I managed to reproduce the bug with the following steps:

  1. Create a brand new Cassandra docker container: docker stop cassandra-prod ; docker rm cassandra-prod; docker run -d --name cassandra-prod -p 9042:9042 cassandra:3.11.3; docker exec -it cassandra-prod bash
  2. Create a keyspace
  3. Create a UDT
  4. Create a table with a column type that is the UDT that was created previously
  5. Insert multiple rows to the table
  6. Drop the UDT column
  7. Restart Cassandra: docker stop cassandra-prod ; docker start cassandra-prod; docker exec -it cassandra-prod bash
  8. Perform a SELECT query on that table


DROP KEYSPACE IF EXISTS my_ks;
CREATE KEYSPACE my_ks WITH replication = {'class':'SimpleStrategy', 'replication_factor':1};
CREATE TYPE my_ks.my_type(column1 text);

CREATE TABLE my_ks.my_table (
  id uuid primary key,
  mt my_type
);

INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});
INSERT INTO my_ks.my_table(id, mt) VALUES(uuid(), {column1 : 'value1'});

ALTER TABLE my_ks.my_table DROP mt;

The following steps reproduces the CorruptSSTableException but not the CommitLogReadHandler$CommitLogReadException.
BTW, on Cassandra 3.11.1 the error was not reproduced using the above-mentioned steps.

1
The issue seems to be similar to this - CASSANDRA-12582. However, I tried to reproduce it using the steps that you provided. I was able to reproduce it in 3.9, but not in 3.10, 3.11.1 or 3.11.3. This behavior seems to be in line with what is described in the jira. Could you provide some more info in order to be able to reproduce it?Horia
Also, a workaround might be to remove to remove the offending commit log, re-add the column and then put back the commit log.Horia
@Horia Did you restarted the Cassandra after those 5 steps and tried to select from the table? If you didn't, sorry for not being clear. I will edit the questionYossi Shasha
Yes, I restarted it (and I also did not run nodetool flush). As stated before, I successfully reproduced it on 3.9Horia
OK, so I had some trouble reproducing it again. But after doing a repeating inserts I think I did. It's a bit ugly but seems like working.Yossi Shasha

1 Answers

2
votes

In Cassandra 4.0 the deletion (drop) of a non-frozen user defined type column will be forbidden. The thrown error is

InvalidRequest: Error from server: code=2200 [Invalid query] message="Cannot drop non-frozen column mt of user type my_type"

I tested this on the trunk. Unfortunately this is not available yet for earlier versions (< 4.0).

Using frozen for your udt column should solve the issue (I tested in 3.11.3) (but altering the type for a column is not possible).

CREATE TABLE my_ks.my_table (
  id uuid primary key,
  mt frozen<my_type>
);

There is also CASSANDRA-14673 opened for this issue.