4
votes

I have read many articles about how fast Cassandra can be; for example, a single-row read can take about 5 ms.

So far I didn't care too much about my website's speed, but as the site grew, some pages started to require quite a few queries. For example, one page has to read 5 different tables and around 50 different rows, and I have noticed that it takes from 0.7 s to 2.0 s, which is really slow. So I took a closer look and found out that a single query takes about 150 ms.
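For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of timing each read separately. It is written in Python purely for illustration (my site uses PHP), and `time_reads` and `fake_read_row` are hypothetical names: `fake_read_row` just sleeps to stand in for a real client call and does not talk to Cassandra.

```python
import time
from statistics import mean


def time_reads(read_row, keys):
    """Call read_row once per key; return (total_seconds, list_of_per_read_seconds)."""
    timings = []
    for key in keys:
        start = time.perf_counter()
        read_row(key)
        timings.append(time.perf_counter() - start)
    return sum(timings), timings


def fake_read_row(key):
    # Hypothetical stand-in for a real single-row read; sleeps to mimic a round trip.
    time.sleep(0.002)
    return {"key": key}


if __name__ == "__main__":
    total, timings = time_reads(fake_read_row, [f"row{i}" for i in range(10)])
    print(f"total: {total:.6f} s, average: {mean(timings):.6f} s, worst: {max(timings):.6f} s")
```

Timing each read on its own, rather than only the whole page, makes it easier to tell whether one slow query or a constant per-query overhead is responsible.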

The table that I'm testing is almost empty, so size can't be the issue. I have installed APC, and it did not help.

I am using PHPCassa, and Thrift comes bundled with this library.

Are these speeds normal? Maybe PHP is just not fast enough? What could I do to improve this situation?

Note: I understand that running so many queries is too much and that Cassandra is optimized for writes, not reads, but in some situations I can't find a way to put the data in a single table/row.

EDIT: I have just found out about the optional C extension, which should improve performance, and indeed it does. A single-row read now takes from 50 ms to 100 ms, so that's a major improvement, though still far from those 5 ms.

EDIT2: Sorry for not updating my question with more information; I have been very busy. I have actually solved this problem: 10 row reads from 4 different tables now take just 0.073158 s, and the average read time is just 0.005575 s, which is way more than I expected to achieve. For those who are facing the same problem, these are the things I would suggest doing:

  • Install the optional C extension; the steps to do that can be found here
  • Install APC
  • Make sure that the right Java version is installed; the wrong one could be causing the slowdown
  • After installing all these things, don't just restart Apache, restart the whole server. I didn't do that at first, and I noticed the major speed improvement only after a server restart
1
Actually a single row read in Cassandra can be much faster than 5ms; in fact I would say 5ms is slow for Cassandra. To help you we'll need to know your data model, query pattern, and cluster configuration. – rs_atl
+1, the info that @rs_atl asked for is really needed to make any good suggestions here. – Tyler Hobbs
Hi, have you had any problems with Cassandra yet? – Ata
Also, how many rows did you have? – Ata
@Ata No, so far it's working perfectly. I had a few hundred, maybe. I'm trying to use as many wide rows as possible, but soon I will try to run a few tests with a lot of rows to see what happens. – Linas

1 Answer

1
votes

This still doesn't explain why a column family that is mostly empty performs worse than others. Next time you face that issue, you should tell us how you use this table and what kind of query performed badly.

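Just a guess: does this column family contain some frequently deleted data? Deleted values are only marked with tombstones, and those tombstones are not actually removed until the grace period (gc_grace_seconds, 864000 seconds = 10 days by default) has passed and a compaction runs.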

So you might face some issues if you perform a lot of writes, reads and deletes on a lot of columns on the same keys.
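To make the grace-period arithmetic concrete, here is a small sketch in Python. The function name `purge_eligible_at` and the example date are made up for illustration; the only Cassandra-specific fact assumed is the default `gc_grace_seconds` of 864000.

```python
from datetime import datetime, timedelta, timezone

# Cassandra's default gc_grace_seconds (10 days); it can be changed per column family.
GC_GRACE_SECONDS = 864_000


def purge_eligible_at(deleted_at, gc_grace_seconds=GC_GRACE_SECONDS):
    """Earliest time a tombstone written at deleted_at may be dropped by compaction."""
    return deleted_at + timedelta(seconds=gc_grace_seconds)


if __name__ == "__main__":
    deleted = datetime(2012, 1, 1, tzinfo=timezone.utc)  # arbitrary example date
    print(timedelta(seconds=GC_GRACE_SECONDS).days)      # 10
    print(purge_eligible_at(deleted).isoformat())        # 2012-01-11T00:00:00+00:00
```

This only gives the earliest possible moment: the tombstone is actually dropped by the first compaction that processes it after that time, so until then reads on those keys still have to skip over it.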