What's the right way to model many-to-many relationships in Cassandra (using 3.10 at the moment)?
From what answers I was able to find, denormalization into two relationship tables is suggested (as in here, for example: Modeling many-to-many relations in Cassandra 2 with CQL3). But there are problems with that on deletes, and those answers are so sparse they do not mention any details on that.
Suppose we have the following tables:
CREATE TABLE foo (
key UUID PRIMARY KEY,
content TEXT
)
CREATE TABLE bar (
key UUID PRIMARY KEY,
content TEXT
)
CREATE TABLE foo_bar (
foo UUID,
bar UUID,
PRIMARY KEY (foo, bar)
)
CREATE TABLE bar_foo (
bar UUID,
foo UUID,
PRIMARY KEY (bar, foo)
)
This seems to be the suggested answer. However, what happens when we try deleting a bar
record? Updating the bar_foo
table is easy:
DELETE FROM bar_foo WHERE bar = <bar_key>
However, an attempt to update the foo_bar
table fails:
DELETE FROM foo_bar WHERE bar = <bar_key>
with the following error:
InvalidRequest: Error from server: code=2200 [Invalid query] message="Some partition key parts are missing: foo"
This is because the primary key for the foo_bar
table is (foo, bar)
, and we specify only the second part of the primary key in the WHERE clause of the DELETE statement. Cassandra, apparently, requires a prefix of the primary key, and bar
without foo
is not a prefix.
Now, changing the primary key to (bar, foo)
won't help. After all, what would you do, then, if a foo
record gets deleted? And, in any case, the entire purpose of the foo_bar
table is to be able to find all bar
records corresponding to a given foo
record, and a SELECT statement also requires a prefix of the primary key in the WHERE clause (which must, by necessity, be foo
).
Can't do SELECT and then DELETE, either, since a SELECT by bar
won't work, it not being a prefix of the primary key.
So what to do with many-to-many relationships, when deletes are in the picture? Is this even possible to accomplish properly?