I'm seeing a read performance degradation in Cassandra v1.2.5 when reading a single row that currently holds few or no columns, but to which many different columns were previously added and then deleted.
To test this, I do the following:
- Create a fresh column family
- Measure the read speed of a row 100 times: 4.6 ms per read on average, with zero columns returned
- Add 500000 columns to the row
- Remove all 500000 columns from the row
- Measure the read speed 100 times again: 282.4 ms per read on average, with zero columns returned
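The measurement steps above can be sketched roughly as follows. This is only a minimal sketch, not the exact test code: `read_row` is a hypothetical callable standing in for a single read of the test row through whatever client is in use, and `fake_read` is a placeholder so the snippet runs without a cluster.

```python
import time
from typing import Callable


def average_read_ms(read_row: Callable[[], object], repeats: int = 100) -> float:
    """Call read_row() `repeats` times and return the mean latency in milliseconds."""
    total = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        read_row()  # hypothetical: one read of the test row via your client
        total += time.perf_counter() - start
    return total / repeats * 1000.0


def slowdown(before_ms: float, after_ms: float) -> float:
    """Ratio of the later average latency to the earlier one."""
    return after_ms / before_ms


if __name__ == "__main__":
    # Placeholder read (assumption); replace with a real read of the row.
    def fake_read() -> list:
        return []

    print(f"{average_read_ms(fake_read):.3f} ms per read")
    # Ratio of the two averages reported in the steps above.
    print(f"slowdown: {slowdown(4.6, 282.4):.1f}x")
```

Running `average_read_ms` once on the fresh column family and once after the add/delete cycle gives the two averages being compared.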
So reads became ~60 times slower (282.4 ms vs. 4.6 ms) after I added and removed the 500000 columns.
I tried compaction, flush, and repair, but nothing really helps. The read time only improved slightly, to 208.7 ms.
The only thing that restores read performance is removing the row completely. Writes and reads on other rows are still fast.
Why does this read speed degradation happen, and how can I fix it?