I was doing some benchmarking that consists of the following data flow:
Kafka --> Spark Streaming --> Cassandra --> PrestoDB
Infrastructure: My Spark Streaming application runs on 4 executors (2 cores and 4 GB of memory each). Each executor runs on a datanode where Cassandra is installed. 4 PrestoDB workers are also co-located on the datanodes. My cluster has 5 nodes, each with an Intel Core i5, 32 GB of DDR3 RAM, a 500 GB SSD and a 1 Gigabit network.
Spark Streaming application: My Spark Streaming batch interval is 10s, and my Kafka producer produces 5000 events every 3 seconds. My streaming application writes to 2 Cassandra tables.
Context in which everything works fine: The streaming application is able to process the events and store them in Cassandra. The batch interval is adequate: ingestion rate, scheduling delay and processing delay stay almost constant for long periods of time.
Context where things get messy and confusing: In my benchmark, every hour I run 6 queries over the Cassandra tables. While these queries are running, the Spark Streaming application can no longer sustain the write throughput and hangs when writing to Cassandra.
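To put the load in numbers: 5000 events every 3 seconds is roughly 1,667 events per second, or about 16,667 events per 10s batch. The small sketch below just spells out that arithmetic. The constant names are mine, and the rows-per-batch figure assumes each event produces one row in each of the 2 tables, which is an illustration rather than a measured number.

```python
# Back-of-the-envelope load implied by the setup described above.
EVENTS_PER_BURST = 5000   # events the Kafka producer emits per burst
BURST_PERIOD_S = 3        # seconds between bursts
BATCH_INTERVAL_S = 10     # Spark Streaming batch interval in seconds
TABLES_WRITTEN = 2        # Cassandra tables the application writes to

events_per_second = EVENTS_PER_BURST / BURST_PERIOD_S
events_per_batch = events_per_second * BATCH_INTERVAL_S

# Assumption (illustration only): each event yields one row in each table.
rows_per_batch = events_per_batch * TABLES_WRITTEN

print(f"events/s:     {events_per_second:.0f}")
print(f"events/batch: {events_per_batch:.0f}")
print(f"rows/batch:   {rows_per_batch:.0f}")
```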
What I've done so far: I searched for this phenomenon in other posts (including Stack Overflow), but I was not able to find a similar case. The best suggestion I've seen was to increase the amount of memory available to Cassandra. I also saw suggestions related to the fetch size of the connectors, but I don't know whether that is the problem, since the issue only occurs when reading and writing simultaneously.
Question: Cassandra shouldn't lock writes while reading, right? What do you think is the source (or sources) of the problem that I need to solve? Which configurations should I take into consideration?
I attached a screenshot illustrating the job stuck in the stage that writes to one of the Cassandra tables while the benchmark is running the 6 queries, as previously explained. If you need more information to trace the problem, please feel free to ask. I appreciate it!
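To make the connector part concrete, these are the kinds of Spark Cassandra Connector settings I came across. This is only a sketch: the property names are the ones I understand the connector 2.x reference to use (other versions may spell them differently), the values are placeholders rather than what I am running, and `to_submit_flags` is just a hypothetical helper to print them as spark-submit arguments.

```python
# Candidate Spark Cassandra Connector settings, rendered as spark-submit flags.
# Property names: as I understand the connector 2.x reference; may differ
# in other versions. Values: placeholders for illustration, not tuned.
candidate_settings = {
    # Max number of write batches a single Spark task runs in parallel.
    "spark.cassandra.output.concurrent.writes": "2",
    # Cap on write throughput per core, in MB/s.
    "spark.cassandra.output.throughput_mb_per_sec": "5",
    # Number of rows grouped into a single write batch.
    "spark.cassandra.output.batch.size.rows": "100",
    # Rows fetched per request; only applies when Spark reads from Cassandra.
    "spark.cassandra.input.fetch.size_in_rows": "500",
}

def to_submit_flags(settings):
    """Turn a {property: value} dict into a list of spark-submit arguments."""
    flags = []
    for key, value in sorted(settings.items()):
        flags.extend(["--conf", f"{key}={value}"])
    return flags

print(" ".join(to_submit_flags(candidate_settings)))
```

I have not verified that any of these actually helps when reads and writes overlap.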
Thank you very much for your support,
Hope I placed the question in a proper manner,
Best regards,
Carlos