4
votes

Among other cases, this DataStax post says that leveled compaction may not be a good option when rows are write-once:

If your rows are always written entirely at once and are never updated, they will naturally always be contained by a single SSTable when using size-tiered compaction. Thus, there's really nothing to gain from leveled compaction.

Also, slide 30 of the talk The Missing Manual for Leveled Compaction Strategy (Wei Deng & Ryan Svihla) says where LCS fits best:

Use cases needing very consistent read performance with much higher read to write ratio

Wide-partition data model with limited (or slow-growing) number of total partitions but a lot of updates and deletes, or fully TTL’ed dataset

I understand that if a row is updated or deleted frequently, it can end up spread across several SSTables, which hurts read performance. From Leveled Compaction in Apache Cassandra:

Performance can be inconsistent because there are no guarantees as to how many sstables a row may be spread across: in the worst case, we could have columns from a given row in each sstable.

However, in a scenario where rows are write-once, doesn't this strategy still offer a benefit when reading all the rows of a partition key?

Because, if I understood correctly, with this strategy rows with the same partition key tend to end up in the same SSTable, since LCS merges SSTables that overlap, in contrast to size-tiered compaction, which merges SSTables of similar size.
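To make the read pattern I mean concrete, it is a query by partition key alone (the table and values here are hypothetical, not from any of the linked posts):

-- hypothetical table with partition key pk and a clustering key;
-- this fetches every CQL row in one partition
SELECT * FROM mytable WHERE pk = 'some-partition';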


2 Answers

2
votes

I think the answer is that when the blog talks about a row, it is referring to a Thrift row, not a CQL row. (I'm not the only one to confuse these terms.)

When we say Thrift row, we are talking about a partition (that is, a set of CQL rows sharing the same partition key). From Does CQL support dynamic columns / wide rows?

+--------------------------------------------------+-----------+
|                   Thrift term                    | CQL term  |
+--------------------------------------------------+-----------+
| row                                              | partition |
| column                                           | cell      |
| [cell name component or value]                   | column    |
| [group of cells with shared component prefixes]  | row       |
+--------------------------------------------------+-----------+

From Understanding How CQL3 Maps to Cassandra’s Internal Data Structure, with the following schema:

CREATE TABLE tweets (
    user text,
    time timestamp,
    tweet text,
    lat float,
    long float,
    PRIMARY KEY (user, time)
);

(remember that the partition key is the first component of the primary key, in this case user)

The following CQL rows

 user         | time                     | lat    | long    | tweet
--------------+--------------------------+--------+---------+---------------------
 softwaredoug | 2013-07-13 08:21:54-0400 | 38.162 | -78.549 |  Having chest pain.
 softwaredoug | 2013-07-21 12:15:27-0400 | 38.093 | -78.573 |   Speedo self shot.
      jnbrymn | 2013-06-29 20:53:15-0400 | 38.092 | -78.453 | I like programming.
      jnbrymn | 2013-07-14 22:55:45-0400 | 38.073 | -78.659 |     Who likes cats?
      jnbrymn | 2013-07-24 06:23:54-0400 | 38.073 | -78.647 |  My coffee is cold.

are internally persisted in the Thrift layout like this:

RowKey: softwaredoug
=> (column=2013-07-13 08:21:54-0400:, value=, timestamp=1374673155373000)
=> (column=2013-07-13 08:21:54-0400:lat, value=4218a5e3, timestamp=1374673155373000)
=> (column=2013-07-13 08:21:54-0400:long, value=c29d1917, timestamp=1374673155373000)
=> (column=2013-07-13 08:21:54-0400:tweet, value=486176696e67206368657374207061696e2e, timestamp=1374673155373000)
=> (column=2013-07-21 12:15:27-0400:, value=, timestamp=1374673155407000)
=> (column=2013-07-21 12:15:27-0400:lat, value=42185f3b, timestamp=1374673155407000)
=> (column=2013-07-21 12:15:27-0400:long, value=c29d2560, timestamp=1374673155407000)
=> (column=2013-07-21 12:15:27-0400:tweet, value=53706565646f2073656c662073686f742e, timestamp=1374673155407000)
-------------------
RowKey: jnbrymn
=> (column=2013-06-29 20:53:15-0400:, value=, timestamp=1374673155419000)
=> (column=2013-06-29 20:53:15-0400:lat, value=42185e35, timestamp=1374673155419000)
=> (column=2013-06-29 20:53:15-0400:long, value=c29ce7f0, timestamp=1374673155419000)
=> (column=2013-06-29 20:53:15-0400:tweet, value=49206c696b652070726f6772616d6d696e672e, timestamp=1374673155419000)
=> (column=2013-07-14 22:55:45-0400:, value=, timestamp=1374673155434000)
=> (column=2013-07-14 22:55:45-0400:lat, value=42184ac1, timestamp=1374673155434000)
=> (column=2013-07-14 22:55:45-0400:long, value=c29d5168, timestamp=1374673155434000)
=> (column=2013-07-14 22:55:45-0400:tweet, value=57686f206c696b657320636174733f, timestamp=1374673155434000)
=> (column=2013-07-24 06:23:54-0400:, value=, timestamp=1374673155485000)
=> (column=2013-07-24 06:23:54-0400:lat, value=42184ac1, timestamp=1374673155485000)
=> (column=2013-07-24 06:23:54-0400:long, value=c29d4b44, timestamp=1374673155485000)
=> (column=2013-07-24 06:23:54-0400:tweet, value=4d7920636f6666656520697320636f6c642e, timestamp=1374673155485000)

We can clearly see that the two CQL rows with user softwaredoug make up a single Thrift row.

The case where a single CQL row corresponds to a single Thrift row (i.e. when the partition key is the entire primary key) is what Deng and Svihla identify as an anti-pattern use case for LCS:

Heavy write with all unique partitions
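A minimal sketch of such a schema (the table name is hypothetical, not from the talk): when the partition key is the whole primary key, every partition holds exactly one CQL row, so a write-once workload leaves nothing for LCS to consolidate.

CREATE TABLE events_by_id (
    event_id uuid,
    payload text,
    PRIMARY KEY (event_id)  -- partition key == entire primary key: one CQL row per partition
);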

However, I will mark dilsingi's answer as correct, because I think he already knew this relationship.

2
votes

When rows are written strictly once, choosing LeveledCompactionStrategy over SizeTieredCompactionStrategy has no effect on read performance (there are other effects; e.g., LCS requires more I/O).

Regarding this comment from the question:

with this strategy the rows with the same partition key tend to be in the same SSTable, because merges SSTables that overlaps in contrast to Size Tiered Compaction that merges SSTables with similar size.

When a row with a given partition key is written exactly once, there is no SSTable-merging scenario for it, as it is not spread across different SSTables in the first place.

When we talk about updates, it need not be an existing column within a row that gets updated. There is also the scenario where we add a completely new clustering key value, along with its associated columns, to an already existing partition key.

Here is a sample table

CREATE TABLE tablename (
    emailid text,
    sent_date date,
    column3 text,
    PRIMARY KEY (emailid, sent_date)
);

Now for a given emailid (say [email protected]), which is a single partition key, there could be inserts at two or more times with different sent_date values. Though these are inserts (essentially upserts) to the same partition key, and hence leveled compaction would benefit here.
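As a sketch (the email and dates are placeholder values), both of these statements write to the same partition but create two distinct CQL rows; if they land in separate memtable flushes they initially sit in separate SSTables, and compaction is what brings the partition back together:

INSERT INTO tablename (emailid, sent_date, column3)
VALUES ('user@example.com', '2017-01-01', 'first message');

INSERT INTO tablename (emailid, sent_date, column3)
VALUES ('user@example.com', '2017-02-01', 'second message');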

But assume the same table with just emailid as the primary key, written exactly once. Then there is no advantage regardless of how SSTables are compacted, be it SizeTieredCompactionStrategy or LeveledCompactionStrategy, as the row would always live in only one SSTable.