1
votes

Im new to Cassandra and try to understand the datamodel. so i know how to insert if "bob" is following "james". i also know how to query to get a list of all people who follow "bob" and i know how to query to get a list of who "bob" is following.

My Question is, given the below, what does the query look like if i would like to find out if "bob" is following "james" ? (Yes or No)

Is this the right query?

SELECT * FROM followers WHERE username="bob" AND following="james" 

Do i need to set a second Index on FOLLOWING to be able to execute the above query?

    -- User storage
CREATE TABLE users (username text PRIMARY KEY, password text);

-- Users user is following
CREATE TABLE following (
    username text,
    followed text,
    PRIMARY KEY(username, followed)
);

-- Users who follow user
CREATE TABLE followers (
    username  text,
    following text,
    PRIMARY KEY(username, following)
);
1

1 Answers

1
votes

No need for a secondary index in this case. You can always test quick ideas like this using the cqlsh shell.

cqlsh> use ks;
cqlsh:ks> CREATE TABLE followers (
      ...     username  text,
      ...     following text,
      ...     PRIMARY KEY(username, following)
      ... );
cqlsh:ks> INSERT INTO followers (username, following ) VALUES ( 'bob', 'james' );
cqlsh:ks> SELECT * FROM followers WHERE username='bob' and following='james';

 username | following
----------+-----------
      bob |     james

The reason why you don't have to make an secondary index (nor should you if you would like perform this sort of query at scale) is because 'following' is specified as a clustering key. This means 'following' describes the layout of the data in the partition meaning we can filter on 'following' very quickly.

As an aside, if a query that is performed frequently requires a secondary index (Allow filtering) that is an indication that you should be rethinking your datamodel.