I installed Cassandra from https://bitnami.com/stack/cassandra on a cloud machine, then cloned that machine so that I have two machines. One runs the Cassandra server (a single-node cluster); the other acts as a client and issues queries to the server.
I used YCSB (https://github.com/brianfrankcooper/YCSB) to run the benchmark. The READ latency measured on the server was very low: roughly 50 to 100 us for both the 99th percentile and the maximum, as reported by "nodetool cfhistograms <'db'> <'table'>" and "nodetool cfstats <'db'>". Most likely all the data was served from cache, i.e. all SSTables were cached.
However, the end-to-end latency observed from the client (the other machine) in the YCSB tests was much higher: an average of 2000 us. Why is the end-to-end latency 2000 us when the server reports only about 100 us? Network latency is also low, around 200 us (as measured with ping). I want the Cassandra server to respond as quickly as possible. Can somebody help?
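
To make the gap concrete, here is a rough latency budget using the numbers above. It is only a sketch: the function name is made up for illustration, it assumes the 200 us ping figure covers the whole network cost of one request, and it subtracts a 99th-percentile server figure from a client-side average, so the result is an order-of-magnitude estimate at best.

```python
def unaccounted_latency_us(end_to_end_us, server_read_us, network_us):
    """Return the part of the client-observed latency that is explained
    neither by the server-side read latency nor by the network."""
    return end_to_end_us - server_read_us - network_us


# Figures from my measurements (all in microseconds).
end_to_end_us = 2000   # average latency reported by YCSB on the client
server_read_us = 100   # upper end of what nodetool reports on the server
network_us = 200       # assumption: ping figure stands in for network cost

gap_us = unaccounted_latency_us(end_to_end_us, server_read_us, network_us)
share = gap_us / end_to_end_us

print(f"Unaccounted latency: {gap_us} us ({share:.0%} of end-to-end)")
```

So roughly 1700 us per read is spent somewhere other than the server-side read path and the network as I measured them, and that is the part I am trying to explain.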