Intro

I'm writing a Python application against a Cassandra 1.2 cluster (7 nodes, replication factor 3), accessing it through the cql library (CQL 3.0).

The problem

The application is built so that when a CQL statement runs against a column family that doesn't exist yet, it automatically creates the table and retries the statement. For example, if I try to run this:

SELECT * FROM table1

and table1 doesn't exist, the application runs the corresponding CREATE TABLE for table1 and retries the select. The problem is that, after the table is created, the retried SELECT fails with this error:

Request did not complete within rpc_timeout

The question

I assume the cluster needs some time to propagate the creation of the table, or something like that? If I wait a few seconds between creating the table and retrying the select statement, everything works, but I want to know exactly why, and whether there is a better way of doing it. Perhaps the create table could wait for the changes to propagate before returning? Is there a way of doing that?
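For reference, the create-then-retry flow with a pause between attempts can be sketched roughly as below. This is only an illustration of the pattern described in the question, not the actual application code: `run_with_auto_create`, `TableMissing` and `RequestTimeout` are hypothetical names, and `execute` stands for any wrapper around the driver that takes a CQL string, returns the result, and raises those errors.

```python
import time


class TableMissing(Exception):
    """Hypothetical error: the statement referenced a table that does not exist."""


class RequestTimeout(Exception):
    """Hypothetical error: the request did not complete within rpc_timeout."""


def run_with_auto_create(execute, statement, create_statement,
                         retries=5, first_delay=0.5, sleep=time.sleep):
    """Run `statement`; if the table is missing, create it and retry with backoff.

    `execute` is any callable that takes a CQL string and returns the result
    (an assumption made for this sketch, not part of the cql library).
    """
    try:
        return execute(statement)
    except TableMissing:
        execute(create_statement)

    delay = first_delay
    for attempt in range(retries):
        try:
            return execute(statement)
        except (TableMissing, RequestTimeout):
            if attempt == retries - 1:
                raise  # give up: the new table never became usable
            sleep(delay)
            delay *= 2  # wait longer before each further attempt
```

Doubling the delay keeps the common case fast while still bounding how long the application waits if the table never becomes usable.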

Thanks in advance

1 Answer

I am assuming you are using cqlsh. The default consistency level for cqlsh is ONE, meaning a request returns once the first node completes it, not necessarily after all nodes have. A read that follows is not guaranteed to hit a node that already has the completed table. You can check this by turning on tracing, but that will affect performance.

You can enforce a stricter consistency level, which should make the create wait until the table has been created on all nodes.

CREATE TABLE ... USING CONSISTENCY ALL
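In case the clause above is not accepted by the CQL version in use, another option is to poll until the nodes agree on the schema before retrying. The sketch below is an assumption-laden illustration, not a confirmed fix: it relies on the `schema_version` columns of the `system.local` and `system.peers` tables, and `execute` is a hypothetical wrapper that takes a CQL string and returns rows as sequences. `wait_for_schema_agreement` and `schema_versions` are names made up for this example.

```python
import time


def schema_versions(execute):
    """Collect the schema versions reported by the local node and its peers.

    Assumes `execute` takes a CQL string and returns rows as sequences.
    """
    local = execute("SELECT schema_version FROM system.local WHERE key='local'")
    peers = execute("SELECT schema_version FROM system.peers")
    return {row[0] for row in list(local) + list(peers) if row[0] is not None}


def wait_for_schema_agreement(execute, timeout=10.0, interval=0.2,
                              clock=time.monotonic, sleep=time.sleep):
    """Return True once the local node and its listed peers report a single
    schema version, or False if `timeout` seconds pass first."""
    deadline = clock() + timeout
    while True:
        if len(schema_versions(execute)) == 1:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Calling `wait_for_schema_agreement(execute)` right after the CREATE TABLE, and retrying the SELECT only when it returns True, replaces a fixed pause with a bounded wait. Note that a node that is down may keep reporting an old version, so the timeout matters.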