4
votes

I am using cassandra 1.2 with CQL3. I have three column families in my keyspace. When I query one of the column family(phones), it takes a long time to retrive. Here is my query

**select * from phones where phone_no in ('9038487582');**

Here is the tracing output for the query.

activity                                        | timestamp    | source      | source_elapsed
-------------------------------------------------+--------------+-------------+----------------
                              execute_cql3_query | 16:35:47,675 | 10.1.26.155 |              0
                               Parsing statement | 16:35:47,675 | 10.1.26.155 |             58
                              Peparing statement | 16:35:47,675 | 10.1.26.155 |            335
      Executing single-partition query on phones | 16:35:47,676 | 10.1.26.155 |           1069
                    Acquiring sstable references | 16:35:47,676 | 10.1.26.155 |           1097
                       Merging memtable contents | 16:35:47,676 | 10.1.26.155 |           1143
 Partition index lookup complete for sstable 822 | 16:35:47,676 | 10.1.26.155 |           1376
 Partition index lookup complete for sstable 533 | 16:35:47,686 | 10.1.26.155 |          10659
      Merging data from memtables and 2 sstables | 16:35:47,704 | 10.1.26.155 |          29192
              Read 1 live cells and 0 tombstoned | 16:35:47,704 | 10.1.26.155 |          29332
                                Request complete | 16:35:47,704 | 10.1.26.155 |          29601

I have only 1 replication factor on the keyspace. and have 3 node cluster. Phones have around 40 million rows and just two columns in each row. it comes back in 29ms, 15ms, 8 ms, 5 ms, 3 ms but it is not consistent. Can you guys give me any suggestions regarding what mistake I might be doing ? Also my usecase will have extremely low cache hit so caching keys is not a solution for me. Also, this is my column family definition.

CREATE TABLE phones (
  phone_no text PRIMARY KEY,
  ypids set<int>
) WITH
  bloom_filter_fp_chance=0.100000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  read_repair_chance=0.100000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  compaction={'class': 'LeveledCompactionStrategy'} AND
  compression={'sstable_compression': 'SnappyCompressor'};
3
What did you expect instead of the numbers you measured?mschenk74
I was expecting some where around 1ms to 2 ms. The above query took 29 ms instead.user2390862

3 Answers

4
votes

The index lookups are reasonably fast (probably the index file is being cached by the OS since it is accessed frequently); where you are losing all the time is in between that and the "merging data" step. What happens in between those is actually seeking to the data location in the sstable. (I've added a new trace entry for 1.2.6 to make that clear.)

This explains why sometimes it's fast, and sometimes not -- if your seek is uncontended, or better yet cached, then the query will be fast. Otherwise it will be slower.

I see several options that could help:

  1. Switch to Leveled compaction (http://www.datastax.com/dev/blog/when-to-use-leveled-compaction)
  2. Add more machines to get more iops by brute force
  3. Switch to SSD to get more iops by better hardware
  4. Add more RAM to make caching more effective at covering up the lack of iops

You'll note that only the first option doesn't include more or different hardware, so that's what I'd evaluate first. But the upside is limited: at best you'll reduce the number of sstables to 1.

3
votes

From the table that you provided above the majority of the query time is in index look up and merging SSTables. This is fairly common, I don't believe that you have done anything wrong.

The index lookup can be avoided by de-normalising the data. It is common practice with Cassandra to design ColumnFamilies around queries, rather than tables as is typical in relational systems. This then puts the burden at the point of writing the data, which is where Cassandra is strongest and of course risks data consistency due to data duplication and cassandra's natural tendency to give clients different views of data in order to optimise for clustered availability of the data.

The merging of sstables is Cassandra's Achilles Heel so to speak. Cassandra optimises write speed and reliability at the expense of both read time latency and latency consistency. It is perfectly normal with Cassandra to have 'slower' reads that keep varying in duration. To reduce this problem there are two approaches, the first is to avoid any kind of update or delete of data in the column family as that is what causes compactions later on. But even then that only delays sstable work as inserting will cause memtables to get flushed. So the other solution, which can be considered if the variation/duration is still too long, is to front Cassandra with a cache such as Memcache. This is the approach that Netflix documented here Netflix benchmarking of Cassandra.

For completeness I should add that the column family settings in Cassandra can be tweaked, benchmarked and then tweaked again to reduce this problem. But that will only get one so far as this problem is inherent in Cassandra's design. The settings that you should look at are the cache sizes, such as memtables and its overflow rate which is the point where a new SSTable will be created. Compression can also help, as that helps to squeeze more data into memory. Typically I expect un-indexed reads to take 2-10ms (avg 5ms) depending on the hardware and cluster activity within Amazon EC2 (which is the environment that I work in these days).

1
votes

Cassandra queries are usually very fast and usually take a constant time. If you do a query on a single column of your column family, how much time does it take to return, compared to doing a query on all columns? Some overhead is expected cause of more columns, but not much, like about 1 or 2ms.

If there is a big difference (more than double) between querying all and querying a single row, even if there is no much data in the column family, maybe your query is not constructed correctly. If you expect to have predictable columns in a row, you can try querying them together instead of querying with the wildcard. It might have a dramatic effect on the speed of the query.