We are running a DSE 3.2.2 cluster with Cassandra and Solr enabled. The cluster has 3 nodes on virtual machines and a replication factor of 2.
Data is written directly to C* using a Java client with the default consistency level (recently changed to QUORUM).
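As a side note on that consistency level: Cassandra computes a quorum as floor(RF / 2) + 1, so with a replication factor of 2 a QUORUM write needs both replicas to acknowledge. A minimal sketch of the arithmetic (the class and method names are made up for illustration and are not part of any driver API):

```java
public class QuorumSketch {
    // Cassandra's quorum size: floor(RF / 2) + 1 replicas must acknowledge.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // Number of replicas that can be unavailable while QUORUM still succeeds.
    static int tolerableFailures(int replicationFactor) {
        return replicationFactor - quorum(replicationFactor);
    }

    public static void main(String[] args) {
        // Print the quorum size for a few small replication factors.
        for (int rf = 1; rf <= 3; rf++) {
            System.out.println("RF=" + rf
                    + " quorum=" + quorum(rf)
                    + " tolerable failures=" + tolerableFailures(rf));
        }
    }
}
```

For our setup (RF=2) this gives a quorum of 2, meaning QUORUM behaves like ALL and tolerates no unavailable replica.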
The issue is that when querying an index the number of documents found varies a lot. Consequently, using the stats component on some of the numeric values also produces inconsistent results.
This is also the case when no data is currently being written. I have since manually triggered a nodetool repair on that column family, which triggered a re-index of the secondary indexes (which took some 5-6 hours). Afterwards, the results remained inconsistent.
In our use case, data that is out of date for a few seconds is not an issue, so the workaround via session stickiness does not solve it for me. The problem is that data remains inconsistent for days afterwards.
Next, a complete re-index with wiping the data is on the list, but will take some time to finish.
Update: Instead of a wipe and a re-index, I will upgrade to the latest version of C* and DSE, then run a repair, then run a re-index and report back asap (a few days at least).
Any suggestions or shared experiences with query inconsistencies are greatly appreciated!
UPDATE #1
The query results are still inconsistent. Every node seems to return a different number of documents for my query. The cluster has been upgraded to 4.5.1, the sstables have been upgraded, repairs have been executed, and the entire Solr index has been rebuilt using the full reindex trigger of the Solr GUI.
The data source table is still using the "old" compact storage option.
UPDATE #2
After the latest comments, I was not sure whether further inserts had run in the meantime. So I made sure to hold off any inserts, ran nodetool repair, and did a full rebuild of the index.
Queries seem to be OK! This seems to imply that the inconsistencies had already re-appeared after my last attempt and were the result of inserts made after the rebuild of the indexes. I will try to confirm this by starting the inserts again.
UPDATE #3
So it looks like things are stable again! The upgrade seems to have resolved the original issues, but the inconsistencies remained because of problems with the changed default transport (from http to tcp), which we found in the log files. I switched back to http, then repaired and reindexed two days ago. All inserts since then have gone through without any issues. Thanks for the help! I will look into the tcp<->http switch at a later time.