I am new to Cassandra and I am struggling with some of the concepts. I see the advantage in having the same data duplicated across multiple tables (with different partition keys) to support queries, but how are ETL jobs typically set up?
Consider a scenario where the data from a single CSV file has to be loaded into multiple tables. Would we run the COPY/sstableloader/cassandra-loader utility with the same CSV file multiple times, once for each table?
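For concreteness, here is what I imagine that would look like with cqlsh's COPY command, assuming two hypothetical tables (`users_by_id` and `users_by_email`) that hold the same user data under different partition keys:

```sql
-- Run in cqlsh; the table and column names here are just examples.
COPY users_by_id (id, email, name) FROM 'users.csv' WITH HEADER = true;
COPY users_by_email (email, id, name) FROM 'users.csv' WITH HEADER = true;
```

Is running the loader once per denormalized table like this the standard practice, or is there a more common pattern?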
How is read consistency maintained while the data has been loaded into only some of the tables and the load script is still running? Two clients querying two different tables could potentially read two different values for the same logical record. Some online forums recommend using materialized views so that Cassandra keeps the denormalized copies in sync. Is that the only alternative?
Thanks!