Cassandra Range Query: Secondary Index vs Unindexed Column

Question

I have seen that the best way to do a range query in Cassandra is to use a clustering key. But I need to run range queries on columns other than the clustering key columns.

I read that we can do this on any column using ALLOW FILTERING. But is there any performance advantage if I create a secondary index on that column?

Performance is going to suck with both ALLOW FILTERING and a secondary index. If performance is something you really care about, you'll need to duplicate your data into a query table, with a primary key definition designed to support your range query. — Aaron
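The query-table approach from this comment could be sketched roughly as follows. This is only an illustration under assumptions: the table names (products, products_by_category_price) and all columns are hypothetical and do not come from the question. The idea is to put the column you want to range over (here price_cents) into the clustering key of a second table, and to have the application write every row to both tables.

```sql
-- Hypothetical base table: price_cents is a regular column, so a range
-- predicate on it would need ALLOW FILTERING.
CREATE TABLE products (
    product_id  uuid PRIMARY KEY,
    category    text,
    price_cents int,
    name        text
);

-- Hypothetical query table holding the same data, keyed for the range query:
-- category is the partition key, price_cents is the first clustering column,
-- and product_id keeps rows with equal prices distinct.
CREATE TABLE products_by_category_price (
    category    text,
    price_cents int,
    product_id  uuid,
    name        text,
    PRIMARY KEY ((category), price_cents, product_id)
) WITH CLUSTERING ORDER BY (price_cents ASC, product_id ASC);

-- The range query now reads a contiguous slice of a single partition,
-- with no secondary index and no ALLOW FILTERING.
SELECT product_id, name, price_cents
FROM products_by_category_price
WHERE category = 'books'
  AND price_cents >= 1000
  AND price_cents < 5000;
```

The trade-off is extra storage and a second write per row. The choice of partition key (category here) is also an assumption: it has to be a value you always know at query time and that keeps partitions at a reasonable size.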

Alec Collier · Accepted Answer · 2017-08-04T01:15:01

Have a look at this link: https://www.datastax.com/dev/blog/allow-filtering-explained-2

The ALLOW FILTERING option lets you tell Cassandra that it is OK to filter the data in memory after it loads rows from disk. So we can use it to search by a clustering column without specifying the preceding clustering columns. But, at least in the Cassandra version used in the blog's example, we can't use it on non-clustering columns.

See the example below, taken from the blog. ALLOW FILTERING on its own doesn't let us filter by the author column. Once we create an index on that column, the query works and no longer needs the ALLOW FILTERING option.

cqlsh:test> SELECT * FROM blogs WHERE author = 'john' ALLOW FILTERING;
Bad Request: No indexed columns present in by-columns clause with Equal operator
cqlsh:test>

cqlsh:test> CREATE INDEX authors ON blogs (author);

cqlsh:test> SELECT * FROM blogs WHERE author = 'john';
(0 rows)
cqlsh:test> SELECT * FROM blogs WHERE author = 'john' ALLOW FILTERING;
(0 rows)
