Cassandra fails to start after schema changed

Question

I'm running Cassandra 3.11.6 on CentOS 7 in a 3-node cluster. I ran some schema changes (dropping tables and materialized views, altering tables, etc.). After that, one of the materialized views started failing with this error: org.apache.cassandra.schema.SchemaKeyspace$MissingColumns: No partition key columns found in schema table for my_keyspace.my_materialized_view.

I wanted to replace that materialized view with a table of the same name, which might be why it's failing.

I ran nodetool describecluster and found that the schema versions differed between nodes. I tried running a repair, but it didn't work. I then restarted the nodes, but they didn't start.

This is the error showing up in cassandra.log:

ERROR [main] 2020-12-09 10:13:15,827 SchemaKeyspace.java:1017 - No partition columns found for table my_keyspace.my_materialized_view in system_schema.columns. This may be due to corruption or concurrent dropping and altering of a table. If this table is supposed to be dropped, run the following query to cleanup: "DELETE FROM system_schema.tables WHERE keyspace_name = 'my_keyspace' AND table_name = 'my_materialized_view'; DELETE FROM system_schema.columns WHERE keyspace_name = 'my_keyspace' AND table_name = 'my_materialized_view';" If the table is not supposed to be dropped, restore system_schema.columns sstables from backups.
org.apache.cassandra.schema.SchemaKeyspace$MissingColumns: No partition key columns found in schema table for my_keyspace.my_materialized_view
    at org.apache.cassandra.schema.SchemaKeyspace.fetchColumns(SchemaKeyspace.java:1106) [apache-cassandra-3.11.6.jar:3.11.6]
    at org.apache.cassandra.schema.SchemaKeyspace.fetchTable(SchemaKeyspace.java:1046) [apache-cassandra-3.11.6.jar:3.11.6]
    at org.apache.cassandra.schema.SchemaKeyspace.fetchTables(SchemaKeyspace.java:1000) [apache-cassandra-3.11.6.jar:3.11.6]
    at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspace(SchemaKeyspace.java:959) [apache-cassandra-3.11.6.jar:3.11.6]
    at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspacesWithout(SchemaKeyspace.java:936) [apache-cassandra-3.11.6.jar:3.11.6]
    at org.apache.cassandra.schema.SchemaKeyspace.fetchNonSystemKeyspaces(SchemaKeyspace.java:924) [apache-cassandra-3.11.6.jar:3.11.6]
    at org.apache.cassandra.config.Schema.loadFromDisk(Schema.java:92) [apache-cassandra-3.11.6.jar:3.11.6]
    at org.apache.cassandra.config.Schema.loadFromDisk(Schema.java:82) [apache-cassandra-3.11.6.jar:3.11.6]
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:269) [apache-cassandra-3.11.6.jar:3.11.6]
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:630) [apache-cassandra-3.11.6.jar:3.11.6]
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:757) [apache-cassandra-3.11.6.jar:3.11.6]

I tried starting Cassandra with -Dcassandra.ignore_corrupted_schema_tables=true, but that doesn't work either.

Erick Ramirez · Accepted Answer · 2020-12-16T05:17:59

It looks like you either made concurrent updates to your schema or issued multiple DDL statements in quick succession, which caused a schema disagreement in your cluster.

To prevent disagreement, you are supposed to wait for each DDL change to propagate through the cluster and check that the schema is in agreement before making the next DDL change.
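
One way to check for agreement between DDL changes is to look at the "Schema versions" section of nodetool describecluster: the schema is in agreement when every node is listed under a single version. The following is only a minimal sketch that parses that output. The function names and the sample text are made up for illustration, and the exact output layout may vary between versions, so compare it with what your own cluster prints.

```python
import re

# Matches lines such as "<uuid>: [10.0.0.1, 10.0.0.2]" or "UNREACHABLE: [10.0.0.3]".
_VERSION_LINE = re.compile(r"^\s*([0-9a-fA-F-]{36}|UNREACHABLE):\s*\[(.*)\]\s*$")


def schema_versions(describecluster_output):
    """Map each schema version (or 'UNREACHABLE') to the nodes listed under it."""
    versions = {}
    for line in describecluster_output.splitlines():
        match = _VERSION_LINE.match(line)
        if match:
            nodes = [n.strip() for n in match.group(2).split(",") if n.strip()]
            versions[match.group(1)] = nodes
    return versions


def schema_in_agreement(describecluster_output):
    """True only if all nodes are reachable and report one schema version."""
    versions = schema_versions(describecluster_output)
    return len(versions) == 1 and "UNREACHABLE" not in versions


# Illustrative output only (invented addresses and version IDs).
SAMPLE_DISAGREEMENT = """Cluster Information:
\tName: Test Cluster
\tSnitch: org.apache.cassandra.locator.SimpleSnitch
\tPartitioner: org.apache.cassandra.dht.Murmur3Partitioner
\tSchema versions:
\t\t86afa796-d883-3932-aa73-6b017cef0d19: [10.0.0.1, 10.0.0.2]
\t\t3c1a5e2b-0f4d-3a6e-9b7c-2d8e1f0a4b5c: [10.0.0.3]
"""

if __name__ == "__main__":
    print(schema_versions(SAMPLE_DISAGREEMENT))
    print("in agreement:", schema_in_agreement(SAMPLE_DISAGREEMENT))
```

In practice you would feed it the captured output of nodetool describecluster after each DDL statement and only issue the next one once it reports agreement.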

My suggestion is to attempt to start the node where you were performing the schema changes and leave the other nodes down temporarily. Hopefully it's also a seed node.

Remove the ignore_corrupted_schema_tables flag and see if you can bring that node back online. If it comes up, proceed to the next node and watch the startup sequence (do a tail -f on the system.log). Keep going until all nodes are back online.
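
If you would rather not eyeball the log on each node, here is a rough sketch that scans system.log lines for the failure from the question or for a sign that startup finished. The function name is hypothetical. The failure marker is taken from the stack trace above, while the readiness marker ("Starting listening for CQL clients") is an assumption about what your version logs when the node is up, so check it against a healthy node's log first.

```python
# Marker taken from the stack trace in the question.
FAILURE_MARKER = "SchemaKeyspace$MissingColumns"
# Assumed readiness message; verify against a healthy node's system.log.
READY_MARKER = "Starting listening for CQL clients"


def startup_status(log_lines):
    """Classify a startup attempt as 'schema_error', 'ready' or 'starting'."""
    status = "starting"
    for line in log_lines:
        if FAILURE_MARKER in line:
            # The schema error is fatal for startup, so stop at the first hit.
            return "schema_error"
        if READY_MARKER in line:
            status = "ready"
    return status


if __name__ == "__main__":
    example = [
        "INFO  [main] Initializing system.IndexInfo",
        "ERROR [main] org.apache.cassandra.schema.SchemaKeyspace$MissingColumns: "
        "No partition key columns found in schema table for "
        "my_keyspace.my_materialized_view",
    ]
    print(startup_status(example))
```

An open file object can be passed directly as log_lines. Only move on to the next node once the current one reports ready.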

The issue is that depending on the state of the schema on each node, it will be difficult to "unscramble the egg". Good luck!
