I just read the DataStax post "Basic Rules of Cassandra Data Modeling". To sum up, we should model our database schema around our queries and not around our relations/objects. As a result, several tables can hold the same duplicated data, for example users_by_email and users_by_username, which both contain the same user data.
How should I handle updating an object?
For example, if a user edits their email, do I UPDATE both tables manually, or do I only INSERT the object again with all its columns and not care about the previous data (which is still in my database, but with a wrong column value: the old email)?
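To make the two options concrete, here is a minimal sketch of what the manual approach might involve. It assumes that email is the partition key of users_by_email and username is the partition key of users_by_username; the `name` column, the function names, and the `%s` placeholder style are hypothetical choices for illustration. Under that assumption the email cannot be changed in place in users_by_email, since primary key columns cannot be updated, so the old row has to be deleted and a new one inserted, while users_by_username only needs a regular UPDATE.

```python
from typing import List, Tuple

# A statement is a CQL string plus its positional parameters.
Statement = Tuple[str, tuple]


def email_change_statements(
    username: str, old_email: str, new_email: str, name: str
) -> List[Statement]:
    """Build the statements that keep both tables in step after an email change.

    Assumed (hypothetical) schema:
      users_by_email    : PRIMARY KEY (email)
      users_by_username : PRIMARY KEY (username)
    """
    return [
        # The email is the key here, so the stale row must be removed explicitly.
        ("DELETE FROM users_by_email WHERE email = %s", (old_email,)),
        ("INSERT INTO users_by_email (email, username, name) VALUES (%s, %s, %s)",
         (new_email, username, name)),
        # The email is an ordinary column here, so an UPDATE is enough.
        ("UPDATE users_by_username SET email = %s WHERE username = %s",
         (new_email, username)),
    ]


def as_logged_batch(statements: List[Statement]) -> Tuple[str, tuple]:
    """Wrap the statements in one logged batch and flatten their parameters."""
    body = "\n".join(f"  {cql};" for cql, _ in statements)
    params = tuple(p for _, stmt_params in statements for p in stmt_params)
    return f"BEGIN BATCH\n{body}\nAPPLY BATCH", params


if __name__ == "__main__":
    cql, params = as_logged_batch(
        email_change_statements("alice", "old@example.com", "new@example.com", "Alice")
    )
    print(cql)
    print(params)
```

The sketch only builds the statements; it does not connect to a cluster. Grouping them in a logged batch is one possible way to make sure that all of the writes are eventually applied together, at some performance cost.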
In the UPDATE case, how can I keep the data synchronized across the tables?
Currently I'm doing it manually, but is there a tool to help me? I could end up with 5 or 6 tables with different partition/clustering keys.
I heard that Hadoop or Apache Spark can do it.