0
votes

I'm looking for a distributed, real-time data access tool. I've read that HBase is the Hadoop NoSQL solution, a Java clone of Google Bigtable, but that it is better suited to batch jobs than to real-time access (and is slow because of all the read-write). I've also read that Cassandra is for "high availability".

Is my understanding of this correct? Is Cassandra better suited for a real-time database (that's distributed) than HBase or Bigtable?

2
Note that Bigtable and HBase have similar data models, and Cassandra is partially derived from Bigtable, but Bigtable has higher performance than either HBase or Cassandra for both low-latency real-time operations and bulk read/write workloads. All of these databases are distributed and support high availability. - Misha Brukman

2 Answers

2
votes

Is Cassandra better suited for a real-time database (that's distributed) than HBase or BigTable?

Yes. In general, Cassandra is better suited to OLTP workloads, whereas HBase is better suited to OLAP workloads.

1
votes

As for Bigtable, from the Cloud Bigtable docs:

Cloud Bigtable is Google's NoSQL Big Data database service. It's the same database that powers many core Google services, including Search, Analytics, Maps, and Gmail.

Bigtable is designed to handle massive workloads at consistent low latency and high throughput, so it's a great choice for both operational and analytical applications, including IoT, user analytics, and financial data analysis.