Many-to-many in Cassandra 3

Question

What's the right way to model many-to-many relationships in Cassandra (using 3.10 at the moment)?

From what answers I was able to find, denormalization into two relationship tables is suggested (as in here, for example: Modeling many-to-many relations in Cassandra 2 with CQL3). But there are problems with that on deletes, and those answers are so sparse they do not mention any details on that.

Suppose we have the following tables:

CREATE TABLE foo (
  key UUID PRIMARY KEY,
  content TEXT
)
CREATE TABLE bar (
  key UUID PRIMARY KEY,
  content TEXT
)
CREATE TABLE foo_bar (
  foo UUID,
  bar UUID,
  PRIMARY KEY (foo, bar)
)
CREATE TABLE bar_foo (
  bar UUID,
  foo UUID,
  PRIMARY KEY (bar, foo)
)

This seems to be the suggested answer. However, what happens when we try deleting a bar record? Updating the bar_foo table is easy:

DELETE FROM bar_foo WHERE bar = <bar_key>

However, an attempt to update the foo_bar table fails:

DELETE FROM foo_bar WHERE bar = <bar_key>

with the following error:

InvalidRequest: Error from server: code=2200 [Invalid query] message="Some partition key parts are missing: foo"

This is because the primary key for the foo_bar table is (foo, bar), and we specify only the second part of the primary key in the WHERE clause of the DELETE statement. Cassandra, apparently, requires a prefix of the primary key, and bar without foo is not a prefix.

Now, changing the primary key to (bar, foo) won't help. After all, what would you do, then, if a foo record gets deleted? And, in any case, the entire purpose of the foo_bar table is to be able to find all bar records corresponding to a given foo record, and a SELECT statement also requires a prefix of the primary key in the WHERE clause (which must, by necessity, be foo).

Can't do SELECT and then DELETE, either, since a SELECT by bar won't work, it not being a prefix of the primary key.

So what to do with many-to-many relationships, when deletes are in the picture? Is this even possible to accomplish properly?

you can create materialized view and cassandra will take care of it... for bar_foo and foo_bar kind of scenario — undefined_variable

silverberry silverberry · Accepted Answer · 2017-06-14T18:37:36

CREATE TABLE foo (
  key UUID PRIMARY KEY,
  content TEXT
)
CREATE TABLE bar (
  key UUID PRIMARY KEY,
  content TEXT
)
CREATE TABLE foo_bar (
  foo UUID,
  bar UUID,
  PRIMARY KEY (foo, bar)
)
CREATE MATERIALIZED VIEW bar_foo AS
  SELECT bar, foo FROM foo_bar
    WHERE foo IS NOT NULL AND bar IS NOT NULL
  PRIMARY KEY (bar, foo)

Many-to-many in Cassandra 3

2 Answers