For a complex real-time Apache Storm topology, I need aggregates of my data (stored in Cassandra) for some computation steps. So far, the data is queried with CQL (Cassandra Query Language) when needed and aggregated in a Storm bolt. That is a bit slow, so we want to cache the data needed for the aggregation. Two options are on the table:

  • Put the data needed in an indexed Ignite cache and run sliding-window queries against it from Storm. In this case we would need only one cache and would use different queries, depending on the aggregation.
  • Put the data in Cassandra's in-memory, off-heap cache.

Argument for Ignite: We only need one indexed cache, while we would need one Cassandra table for each aggregation, for fast access. (Also ACID, but obviously we already live with CAP, so not a strong argument for our architects.)
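To make the "one indexed cache, different queries" idea concrete, here is a minimal sketch. It deliberately uses Python's built-in `sqlite3` in-memory database as a stand-in for an SQL-capable indexed cache; it is not Ignite code and says nothing about Ignite's performance. The table name `events`, its columns, and the helper functions are all hypothetical.

```python
import sqlite3

def build_store(rows):
    """Create one in-memory table with an index on the timestamp column.

    sqlite3 is only a stand-in for an indexed, SQL-queryable cache.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (sensor TEXT, ts INTEGER, value REAL)")
    conn.execute("CREATE INDEX idx_events_ts ON events (ts)")
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    return conn

def window_sum(conn, start, end):
    """Per-sensor sum of values with start <= ts < end."""
    cur = conn.execute(
        "SELECT sensor, SUM(value) FROM events "
        "WHERE ts >= ? AND ts < ? GROUP BY sensor ORDER BY sensor",
        (start, end),
    )
    return cur.fetchall()

def window_avg(conn, start, end):
    """Overall average of values with start <= ts < end."""
    cur = conn.execute(
        "SELECT AVG(value) FROM events WHERE ts >= ? AND ts < ?",
        (start, end),
    )
    return cur.fetchone()[0]

# Hypothetical sample data: (sensor, timestamp, value)
rows = [("a", 1, 10.0), ("a", 2, 20.0), ("b", 2, 5.0), ("b", 9, 100.0)]
conn = build_store(rows)
```

Both aggregations read the same indexed data set and differ only in the query text. With the query-per-table approach, each of them would instead get its own table laid out for that one access pattern.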

Argument for Cassandra: We don't need to introduce a new technology.

But: What about speed? How fast would an indexed Ignite cache be compared to an optimized (= own table for each query) in-memory Cassandra?

1 Answer

I believe that in-memory indexed SQL in Ignite would be faster than Cassandra CQL queries. Apache Ignite is ANSI SQL-99 compliant, so you should be able to do all sorts of aggregations, joins, ORDER BY, GROUP BY, etc.

I will raise this with the Ignite community to see whether Cassandra CQL could be benchmarked against Ignite SQL. Once that is done, I will post the results here.
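Until such a benchmark exists, you could measure both options against your own data. Below is a minimal timing-harness sketch in plain Python; `time_query` is a hypothetical helper, and the callable you pass in is an assumption standing in for whatever executes your real Ignite SQL or CQL query.

```python
import statistics
import time

def time_query(run_query, repeats=100, warmup=10):
    """Time a zero-argument callable and return basic latency statistics.

    run_query is a placeholder for a function that executes one query
    (for example via an Ignite or Cassandra client) and fetches its result.
    """
    # Warm-up runs let caches and connections settle before measuring.
    for _ in range(warmup):
        run_query()

    samples = []
    for _ in range(repeats):
        started = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - started)

    samples.sort()
    p95_index = int(0.95 * (len(samples) - 1))
    return {
        "runs": len(samples),
        "median_s": statistics.median(samples),
        "p95_s": samples[p95_index],
    }
```

Call it once per candidate with the same aggregation and the same data volume, then compare the median and 95th-percentile latencies rather than a single run.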