1
votes

The Cassandra documentation says that the "nodetool snapshot" command takes a snapshot of table data. However, I also see schema.cql and manifest.json files in the snapshot directory where the snapshot files are generated.

Is this expected behavior? Also, can I use this schema.cql file to restore the schema if needed?

My Cassandra version:

cqlsh> show version
[cqlsh 5.0.1 | Cassandra 3.0.9 | CQL spec 3.4.0 | Native protocol v4]

>nodetool version
ReleaseVersion: 3.0.9

EDIT:

  1. Is it mandatory to use the cql file from the snapshot when restoring data? Suppose I have the create-table cql stored somewhere else. Can I use that instead? I ran some tests. When I re-created the table using the cql from the snapshot, the ID in the table name stayed the same: "employee-42a71380966111e8870f97a01282a56a". However, when I re-created the table using my original cql, the ID in the table name changed. Can this be a problem, and is that why we should use the cql from the snapshot? Note: when I restored data from the snapshot, it loaded fine in both cases.
  2. This cql file is for the table. Can we also get cql from the snapshot to create the keyspace?
  3. Does the cql file get generated only for user-defined tables? I can't see a cql file being generated for system tables.

2 Answers

3
votes

Yes, these files are necessary for restoring this particular table. schema.cql captures the structure of the table at the time of the snapshot, because the snapshot has to be restored into a table with the same structure.

You can find more detailed description in the DataStax documentation.

Update after addition of more questions:

  1. Having the schema in the snapshot makes life easier, because schemas often evolve over time. You can use a schema stored elsewhere, as long as you can guarantee that it matches the data in the snapshot.
  2. nodetool snapshot generates schemas for tables only, not for keyspaces.
  3. It's better not to mess with system tables.
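
To check which schema.cql belongs to which table ID before restoring, something like the following Python sketch may help. It assumes the usual on-disk layout `<data_dir>/<keyspace>/<table>-<id>/snapshots/<tag>/schema.cql`; the function names and the `hr` keyspace in the usage note are made up for illustration, and the data directory path depends on your installation.

```python
from pathlib import Path


def split_table_dir(dirname):
    """Split a table directory name such as 'employee-42a7...' into (table, id).

    Assumes the ID is whatever follows the last hyphen in the directory name.
    """
    table, _, table_id = dirname.rpartition("-")
    return table, table_id


def find_snapshot_schemas(data_dir):
    """Return (keyspace, table, table_id, tag, schema_path) for every schema.cql.

    Assumed layout: <data_dir>/<keyspace>/<table>-<id>/snapshots/<tag>/schema.cql
    """
    results = []
    for schema in sorted(Path(data_dir).glob("*/*/snapshots/*/schema.cql")):
        tag_dir = schema.parent             # .../snapshots/<tag>
        table_dir = tag_dir.parent.parent   # .../<table>-<id>
        table, table_id = split_table_dir(table_dir.name)
        results.append((table_dir.parent.name, table, table_id, tag_dir.name, schema))
    return results
```

For the directory from the question, `split_table_dir("employee-42a71380966111e8870f97a01282a56a")` gives the table name `employee` and the ID `42a71380966111e8870f97a01282a56a`. Calling `find_snapshot_schemas` on your data directory lists one entry per snapshot that contains a schema.cql, so you can compare the IDs with what the live cluster uses.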

Here is a detailed knowledge base article from DataStax support about backup and restore.

3
votes

The doc link you have given is for Apache Cassandra, while the answer above refers to DataStax. I have taken snapshots and restored them in Apache Cassandra 2.0.4, and it doesn't take any schema backup. All schemas need to be copied separately and created manually in the new cluster.
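
One way to copy the schema separately is to dump it with cqlsh's `DESCRIBE KEYSPACE`, which prints the keyspace definition together with its tables. The sketch below is only an illustration: it assumes `cqlsh` is on the PATH and can connect without credentials, and the function names and default host are placeholders of my own.

```python
import subprocess


def describe_keyspace_command(keyspace, host="127.0.0.1"):
    """Build the cqlsh command line that prints the schema of one keyspace."""
    return ["cqlsh", host, "-e", f"DESCRIBE KEYSPACE {keyspace}"]


def dump_keyspace_schema(keyspace, out_path, host="127.0.0.1"):
    """Run cqlsh and write the keyspace schema to out_path.

    Assumes cqlsh is installed and the node at `host` accepts the connection.
    Raises subprocess.CalledProcessError if cqlsh exits with an error.
    """
    result = subprocess.run(
        describe_keyspace_command(keyspace, host),
        check=True,
        capture_output=True,
        text=True,
    )
    with open(out_path, "w") as f:
        f.write(result.stdout)
```

The resulting file can then be replayed on the new cluster with `cqlsh -f <file>` before the snapshot data is copied in.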