Folks,
We are evaluating Cassandra for one of our production applications. We have a few basic questions that we would like to understand before going forward.
WRITE :
Cassandra uses a consistent hashing mechanism to distribute keys evenly across nodes, so a given key will be stored on a particular Cassandra node.
We further understand that an internal SSTable structure is created to store this data within the node.
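To show what we mean by consistent hashing, here is a minimal Python sketch of our understanding. The `HashRing` class, the MD5-based token, and the node names are our own illustrative assumptions, not Cassandra's actual partitioner or API; replication and virtual nodes are left out.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring: each node owns the keys up to its token."""

    def __init__(self, nodes):
        # Place every node on the ring at the position given by its token.
        self._ring = sorted((self._token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._ring]

    @staticmethod
    def _token(value):
        # Hypothetical token function: MD5 digest read as a big integer.
        return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest(), "big")

    def node_for(self, key):
        # Find the first node token >= the key's token, wrapping around
        # to the start of the ring if the key hashes past the last node.
        idx = bisect.bisect_left(self._tokens, self._token(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("user:42")  # always the same node for the same key
```

Any node that receives a request can run the same calculation, which is how we assume a read sent to an arbitrary node finds the node that owns the key.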
READ :
While performing a read, the client sends the request to any node in the Cassandra cluster, and based on consistent hashing Cassandra determines which node holds the key.
The following things are not clear:
1) How many SSTables are created for a given keyspace/column family on a node? (Is it some fixed number, or only 1?)
2) The Cassandra documentation describes a Bloom filter (an alternative to standard hashing) which is used to determine whether a given key is present in an SSTable or not. (What if there are 1000 SSTables? Will there be 1000 Bloom filters, each of which is checked to determine whether the key is present?)
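To make question 2 concrete, here is a minimal Python sketch of how we picture the read path on a single node with many SSTables. `BloomFilter`, `ToySSTable`, and `read` are hypothetical names of ours, the filter sizes are arbitrary, and we assume each key lives in at most one table; this is not Cassandra's implementation.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


class ToySSTable:
    """Stand-in for an on-disk SSTable with its in-memory Bloom filter."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.bloom = BloomFilter()
        for key in self.rows:
            self.bloom.add(key)


def read(sstables, key):
    """Return (value, lookups); lookups counts tables actually opened."""
    lookups = 0
    for table in sstables:
        if not table.bloom.might_contain(key):
            continue  # cheap in-memory check says the key is definitely absent
        lookups += 1  # this stands for the expensive disk read
        if key in table.rows:
            return table.rows[key], lookups
    return None, lookups


tables = [ToySSTable({f"key-{i}": f"value-{i}"}) for i in range(1000)]
value, lookups = read(tables, "key-42")
```

In this sketch all 1000 filters may be consulted, but each check is an in-memory bit test, and only the tables whose filter answers "maybe" are opened. Whether that matches what Cassandra really does is exactly what we would like to confirm.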