
I have a cluster of Apache Cassandra nodes on GCP and Python 3 on one VM. Using "cqlsh --cqlshrc", the query that I need runs without any error. The cqlshrc file sets a custom timestamp format and an increased request timeout:

[copy]
DATETIMEFORMAT = %m/%d/%Y %H:%M:%S

[connection]
request_timeout = 6000

I also have the "cqlshrc" file in the "~/.cassandra/" folder, so I can use it without passing it as a parameter. Now a Python script that uses "cassandra-driver" needs to talk to Cassandra and run some queries, but I get this error:

Traceback (most recent call last):
  File "queries.py", line 10, in <module>
    query1()
  File "queries.py", line 6, in query1
    rows = session.execute('SELECT count(*) FROM freeway_loopdata WHERE speed > 100 ALLOW FILTERING')
  File "cassandra/cluster.py", line 2345, in cassandra.cluster.Session.execute
  File "cassandra/cluster.py", line 4304, in cassandra.cluster.ResponseFuture.result
cassandra.ReadFailure: Error from server: code=1300 [Replica(s) failed to execute read] message="Operation failed - received 0 responses and 1 failures" info={'consistency': 'LOCAL_ONE', 'required_responses': 1, 'received_responses': 0, 'failures': 1}

I believe this happens because the timeout has not been increased. How can I pass the "cqlshrc" file to the Python script as a parameter?

I have already tried "Session.default_timeout=an integer" and "Session.request_timeout=an integer" to increase the timeout, without success. - Damon
How are you declaring the connection to the database? What is the structure of the table? In cassandra.yaml, how are you declaring the listen_address value? Using an ALLOW FILTERING statement is an antipattern. - Carlos Monroy Nieblas
Also, the way the Python driver establishes the connection to the database is not related to cqlshrc. - Carlos Monroy Nieblas
The way I establish the connection is:
    from cassandra.cluster import Cluster
    cluster = Cluster(['0.0.0.0'], port=9042)
    session = cluster.connect('cs588damon', wait_for_all_pools=True)
    session.request_timeout = 60000
    session.execute('USE cs588damon')
- Damon
This piece of code is on the same VM that I can query from. The same query using cqlsh on that VM works just fine. - Damon

1 Answer


The timeout that you're trying to increase is the client-side timeout. If the server-side timeout hasn't been increased as well (which is a bad idea anyway), raising the client-side value doesn't help.
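For the client side only, the Python driver does not read cqlshrc, but the value can be read with the standard library and passed to the driver. The following is a minimal sketch, not a definitive implementation: the helper names `read_request_timeout` and `count_fast_readings` are hypothetical, the 10-second fallback is an assumption chosen to match the driver's default, and the per-call `timeout` argument of `Session.execute` is in seconds.

```python
import configparser
import os


def read_request_timeout(path="~/.cassandra/cqlshrc", default=10.0):
    """Return [connection] request_timeout (seconds) from a cqlshrc file.

    Falls back to `default` if the file, section, or option is missing.
    """
    # interpolation=None keeps '%' in values such as DATETIMEFORMAT literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(os.path.expanduser(path))
    return parser.getfloat("connection", "request_timeout", fallback=default)


def count_fast_readings(contact_points, keyspace, timeout):
    """Run the query from the question with an explicit client-side timeout."""
    # Imported here so the cqlshrc helper works without the driver installed.
    from cassandra.cluster import Cluster

    cluster = Cluster(contact_points, port=9042)
    try:
        session = cluster.connect(keyspace)
        rows = session.execute(
            "SELECT count(*) FROM freeway_loopdata "
            "WHERE speed > 100 ALLOW FILTERING",
            timeout=timeout,  # seconds; overrides session.default_timeout
        )
        return rows.one()[0]
    finally:
        cluster.shutdown()
```

A call might look like `count_fast_readings(["127.0.0.1"], "cs588damon", read_request_timeout())`, where the contact point is an assumption. This only changes how long the client waits; a failure reported by the replica itself would likely still surface as the same ReadFailure.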

More importantly, you're performing an operation that Cassandra isn't optimized for: an aggregation (count) filtered on an arbitrary column, without restricting the partition key. This forces Cassandra to sift through all the data on all nodes and filter out only the matching entries. This is an anti-pattern for Cassandra; queries of this kind should be run via Spark, for example. I recommend taking some courses on DataStax Academy (DS201 and DS220 at least) to understand how Cassandra works and how to model data for it.
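The cost difference can be illustrated with a toy in-memory model. This is only a sketch and not Cassandra code: the table schema is not shown in the question, so using a hypothetical `detector_id` as the partition key, and the sample rows themselves, are assumptions made purely for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (detector_id, speed). The real freeway_loopdata schema
# is not shown, so detector_id as the partition key is an assumption.
ROWS = [(1, 55), (1, 120), (2, 101), (2, 40), (3, 99), (3, 130)]


def build_partitions(rows):
    """Group rows by partition key, loosely mirroring how data is laid out."""
    partitions = defaultdict(list)
    for detector_id, speed in rows:
        partitions[detector_id].append(speed)
    return partitions


def count_fast_full_scan(partitions, threshold):
    """Filter without a partition key: every row in every partition is examined."""
    examined = matched = 0
    for speeds in partitions.values():
        for speed in speeds:
            examined += 1
            matched += speed > threshold
    return matched, examined


def count_fast_in_partition(partitions, detector_id, threshold):
    """Filter within one partition key: only that partition is examined."""
    speeds = partitions.get(detector_id, [])
    return sum(s > threshold for s in speeds), len(speeds)
```

Both functions return a pair of (matching rows, rows examined). In the unrestricted case the second number grows with the whole table, which on a real cluster means work on every node; in the restricted case it grows only with the size of one partition.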