com.datastax.bdp.search.solr.Cql3SolrSecondaryIndex is a custom Cassandra index type introduced by DataStax for Solr integration. My main question is: can CQL queries use these custom indexes at all? I've tried a few CQL queries with filters on the indexed columns, but they always end with an RPC timeout.
My use case:
I have a table where the queries usually involve filters on multiple columns. Since Cassandra's native secondary indexes can only be defined on one column at a time (i.e. one index = one column) and only one index can be used by any given CQL query, I figured that I can't fulfill my application's read requirements using CQL. This is why I resorted to Solr for ALL read operations: Solr can filter on multiple columns at once. This works fine for most cases, BUT I have two queries that turned out to be too heavy for Solr.
Now I want to try Spark, because I've read about its analytics capabilities. However, I ran into a blocker: Spark relies on the CQL "WHERE" clause to filter the data that is loaded from Cassandra into Spark. Since CQL queries seemingly can't use Cql3SolrSecondaryIndex for reads, I don't know how else I can load my data into Spark. I'm aware that filtering on the Cassandra server side is not compulsory when loading data from Cassandra into Spark, but in my case it is required because the table is too big (approx. 4 billion records spread across 6 nodes at RF=2).
I tried to define a native Cassandra index on one of the columns that I intend to filter on, but Cassandra threw an error saying that an index already exists for that column (i.e. the Cql3SolrSecondaryIndex index).
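To put the scale in numbers, here is a rough back-of-the-envelope sketch in Python. It uses only the figures above (about 4 billion rows, 6 nodes, RF=2) and assumes the data is spread evenly across the nodes. The 0.1% filter selectivity is a made-up value for illustration, not a measurement from my cluster, and the function names are hypothetical.

```python
# Figures quoted in the question; everything else is derived from them.
TOTAL_ROWS = 4_000_000_000   # approximate logical row count
NODES = 6
REPLICATION_FACTOR = 2


def replica_rows_per_node(total_rows: int, nodes: int, rf: int) -> float:
    """Average row replicas stored per node, ASSUMING an even distribution."""
    return total_rows * rf / nodes


def rows_after_filter(total_rows: int, selectivity: float) -> int:
    """Rows that would reach Spark if the filter ran on the Cassandra side."""
    return round(total_rows * selectivity)


per_node = replica_rows_per_node(TOTAL_ROWS, NODES, REPLICATION_FACTOR)
print(f"replica rows per node (approx.): {per_node:,.0f}")

# Hypothetical selectivity of 0.1%, chosen only for illustration.
filtered = rows_after_filter(TOTAL_ROWS, 0.001)
print(f"rows loaded with a 0.1% server-side filter: {filtered:,}")
print(f"rows loaded without any server-side filter: {TOTAL_ROWS:,}")
```

Even with a fairly unselective filter, the difference between the two cases is several orders of magnitude, which is why I consider server-side filtering a hard requirement here.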
As it appears to me now, DSE forces me to choose between Solr and Spark: if I include a column in the Solr core, a Cql3SolrSecondaryIndex index is defined on that column and I can no longer define a native Cassandra index on it. Without a native Cassandra index, CQL queries cannot filter on that column. Without server-side CQL filtering, Spark would have to load all 4 billion rows and would likely run out of memory.
Is my impression correct? Is there a workaround?