I'm reading rows from a column family (CF) using Hector with the default Cassandra cache settings, which means the key cache is turned on. I'm using jconsole to monitor the key cache hits.
But even after reading a single row (by its primary key) 100 times, the key cache hit count does not increase. The row had been updated recently.
So what is the Cassandra read flow when the key cache is turned on? Is it like this?
- The in-memory MemTable is checked for the row (it could reside there after a recent insert/update).
- If the row is not found in the MemTable, the key cache is checked for the key.
- If the key is found (a cache hit), one seek is needed to get the row; otherwise, two seeks.
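To make the flow I'm asking about concrete, here is a minimal Python sketch of it. This is only a toy model of my guess, not Cassandra's actual code: the `ToyStore` class, its method names, and the seek counts are all made up for illustration.

```python
class ToyStore:
    """Toy model of the read flow guessed at above (hypothetical, not Cassandra code)."""

    def __init__(self):
        self.memtable = {}      # recent writes, held in memory
        self.sstable = {}       # stands in for flushed, on-disk data
        self.key_cache = set()  # keys whose on-disk position is "cached"
        self.cache_hits = 0

    def write(self, key, row):
        # Writes land in the memtable first.
        self.memtable[key] = row

    def flush(self):
        # Move everything from the memtable to the on-disk stand-in.
        self.sstable.update(self.memtable)
        self.memtable.clear()

    def read(self, key):
        """Return (row, seeks) following the three steps in the list above."""
        if key in self.memtable:
            return self.memtable[key], 0       # served from memory, cache untouched
        if key in self.key_cache:
            self.cache_hits += 1
            return self.sstable.get(key), 1    # cache hit: one seek
        if key in self.sstable:
            self.key_cache.add(key)            # remember the key for next time
        return self.sstable.get(key), 2        # cache miss: two seeks


store = ToyStore()
store.write("k1", {"x": 1})
for _ in range(100):
    store.read("k1")
hits_while_in_memtable = store.cache_hits

store.flush()
_, first_seeks = store.read("k1")
_, second_seeks = store.read("k1")
```

In this model, reads of a recently written row never touch the key cache, which is what I see with Hector.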
But using cassandra-cli and cassandra-jdbc (CQL), I get different results.
That is, even when I have updated the row recently, every read of the row results in a key cache hit. If I read it 100 times, I get 100 hits.
Why the discrepancy?
Well, I kind of figured this out myself, but would like someone to confirm.
It looks like an update brings only the column being updated into the MemTable, not the whole row.
When I updated the row using Hector, I had not updated all the columns, just column x, and I was reading that same column x in the read operation. So there was no cache hit, as the column was already in the MemTable.
When running CQL, I was running a select * from cf, which fetched the other column y too. Column y had not been updated, so I am assuming it was not in memory (MemTable), hence the cache hit.
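Here is a small Python sketch of the explanation I'm proposing, tracking columns instead of whole rows. It is a toy model of my assumption only, not Cassandra's implementation: `ToyColumnStore` and everything in it are hypothetical names.

```python
class ToyColumnStore:
    """Toy column-level model of the explanation above (hypothetical, not Cassandra code)."""

    def __init__(self):
        self.memtable = {}      # key -> {column: value}, recent writes only
        self.sstable = {}       # key -> {column: value}, stands in for disk
        self.key_cache = set()
        self.cache_hits = 0

    def write(self, key, columns):
        # Only the columns being written go into the memtable.
        self.memtable.setdefault(key, {}).update(columns)

    def flush(self):
        for key, cols in self.memtable.items():
            self.sstable.setdefault(key, {}).update(cols)
        self.memtable.clear()

    def _read_sstable(self, key):
        # Any trip to the on-disk data goes through the key cache.
        if key in self.key_cache:
            self.cache_hits += 1
        elif key in self.sstable:
            self.key_cache.add(key)
        return self.sstable.get(key, {})

    def read(self, key, columns=None):
        """columns=None models `select *`; a list models a read by column name."""
        mem = self.memtable.get(key, {})
        if columns is not None and all(c in mem for c in columns):
            # Every requested column is in memory, so disk is never consulted.
            return {c: mem[c] for c in columns}
        merged = dict(self._read_sstable(key))
        merged.update(mem)  # memtable values are newer and win
        if columns is None:
            return merged
        return {c: merged[c] for c in columns if c in merged}


cs = ToyColumnStore()
cs.write("k1", {"x": 1, "y": 2})
cs.flush()
cs.read("k1")                  # first disk read: cache miss, key gets cached
cs.write("k1", {"x": 10})      # update only column x

for _ in range(100):
    cs.read("k1", ["x"])       # like my Hector read of column x
hits_after_named_reads = cs.cache_hits

for _ in range(100):
    full_row = cs.read("k1")   # like `select * from cf`
hits_after_full_reads = cs.cache_hits
```

Under this assumption, the 100 reads of column x register no key cache hits, while the 100 full-row reads register 100, which matches what I observed.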