Cassandra - CQL queries [COUNT, ORDER_BY, GROUP_BY ]

Question

I'm new to Cassandra and I'm trying to learn more about how this DB engine works (especially the CQL part) and compare it with MySQL.

With this in mind I was trying some queries, but there is one particular query that I can't figure out. From what I could read, it seems that it's not possible to do this query in Cassandra, but I would like to know for sure whether there is some workaround.

Imagine the following table [Customer] with PRIMARY_KEY = id:

id, name, city, country, email 
01, Jhon, NY, USA, jhon@
02, Mary, DC, USA, mary@
03, Smith, L, UK, smith@
.....

I want a listing that shows how many customers I have per country, ordered by that count in descending order.

In MySQL it would be something like:

SELECT COUNT(Id), country 
FROM customer
GROUP BY country
ORDER BY COUNT(Id) DESC

But in Cassandra (CQL) it seems that I can't GROUP BY columns that aren't part of the PRIMARY KEY (as is the case with "country"). Is there any way around this?

Although CQL resembles SQL, it's not the same. To perform things like aggregations, sorting, etc., you need to model your table the correct way. I recommend taking the DS220 course about data modelling in Cassandra: academy.datastax.com/resources/ds220 — Alex Ott

Hossein Hossein · Accepted Answer · 2018-12-03T02:35:36

You need to define a secondary index on "country". Secondary indexes are used to query a table by a column that is not normally queryable, i.e. one that is not part of the primary key.

For ORDER BY, define a clustering key on 'id'. Clustering keys are responsible for sorting data within a partition.
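
As a rough sketch of what this could look like in CQL (the table name customer_by_country, the index name customer_country_idx and the column types are assumptions for illustration, not something given in the question): the index makes "country" usable in a WHERE clause, while a second table keyed by country is what makes a per-country GROUP BY possible, assuming Cassandra 3.10 or later.

```cql
-- Secondary index on the original table (index name is hypothetical).
-- It allows filtering by country without ALLOW FILTERING.
CREATE INDEX customer_country_idx ON customer (country);

-- Count for a single country, served through the index.
SELECT COUNT(*) FROM customer WHERE country = 'USA';

-- Hypothetical denormalized table: country is the partition key,
-- id is a clustering key, so rows inside each country are sorted by id.
CREATE TABLE customer_by_country (
    country text,
    id      text,
    name    text,
    city    text,
    email   text,
    PRIMARY KEY ((country), id)
) WITH CLUSTERING ORDER BY (id DESC);

-- GROUP BY works here because country is the partition key
-- (assumes Cassandra 3.10+, where GROUP BY was introduced).
SELECT country, COUNT(*) FROM customer_by_country GROUP BY country;
```

Note that ORDER BY in CQL only applies to clustering columns, so the result cannot be ordered by COUNT(*) in the query itself; sorting the per-country counts in descending order would have to happen in the application. The GROUP BY query also reads every partition, so it may be slow on a large table.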
