2
votes

I'm playing with Cassandra(cql). When I do a SELECT it returns me an error:

Clustering column "domain" cannot be restricted (preceding column "timestamp" is restricted by a non-EQ relation).

Columns "domain" (type varchar) and "timestamp" (type timestamp) are Primary Keys.

WHEN I filter only by "timestamp" like this:

SELECT id, comment, language, location, section, domain, type, user, timestamp 
    FROM comments WHERE ts >= ? AND  ts <= ? ALLOW FILTERING;

Works fine, but when I try to filter including "domain" returns me that error:

SELECT id, comment, language, location, section, domain, type, user, timestamp 
    FROM comments WHERE domain=? AND ts >= ? AND  ts <= ? ALLOW FILTERING;

CREATE TABLE CQL:

CREATE TABLE table  ( 
id varchar,
type varchar,
section varchar,
comment text,
language varchar,
location text,
user int,
domain varchar,
timestamp timestamp,
PRIMARY KEY (id, timestamp, domain, user, section))
WITH CLUSTERING ORDER BY (timestamp DESC);
1
Have you looked at official documentation about WHERE limitations in Cassandra? It clearly states that range queries can be only at the last portion of the WHERE clause, and the error says that you are using ts in non restricted way. By the way the error says Clustering column, probably you have underestimate something in your table definition (eg partition key, clustering key). Post it and let's see...xmas79
This is my table definition. CREATE TABLE table ( id varchar, type varchar, section varchar, comment text, language varchar, location text, user int, domain varchar,timestamp timestamp, PRIMARY KEY (id,timestampdomain,user,section)) WITH CLUSTERING ORDER BY (ts DESC);Lexo Alonzo

1 Answers

0
votes

Important to note, but with Cassandra you need to take a query-centric approach to data modeling. Essentially, if this is the query you need to support:

SELECT id, comment, language, location, section, domain, type, user, timestamp 
    FROM comments WHERE domain=? AND timestamp >= ? AND  timestamp <= ?;

Then your PRIMARY KEY should be designed to support that:

PRIMARY KEY (domain, timestamp, user, section, id))

As you won't be querying on id, it doesn't make sense to have it as your partitioning key (the left-most key in the PRIMARY KEY definition). But you might still want it to be a part of the key, to help ensure uniqueness.

Notes:

  • Queries with ALLOW FILTERING are terrible for performance, and should NEVER be used on a production system.
  • This solution assumes that the cardinality of domain to id isn't infinite (which is probably is). To get ahead of the potential for unbound row growth on your partitions, you may want to "bucket" your partition key by compositing domain with something else.