When integrating Solr with Cassandra using DSE, adding a Solr core for a column family creates indexes on all the top-level fields that are indexed in the Solr schema. With the example CF and Solr schema outlined here, several indexes are generated:
cassandra@cqlsh:demo1> desc demo;
CREATE TABLE demo1.demo (
id text PRIMARY KEY,
friends list<frozen<name>>,
magic_numbers frozen<tuple<int, int, int>>,
name frozen<name>,
solr_query text,
status text
[skipped]
CREATE CUSTOM INDEX demo1_demo_friends_index ON demo1.demo (friends) USING 'com.datastax.bdp.search.solr.Cql3SolrSecondaryIndex';
CREATE CUSTOM INDEX demo1_demo_magic_numbers_index ON demo1.demo (magic_numbers) USING 'com.datastax.bdp.search.solr.Cql3SolrSecondaryIndex';
CREATE CUSTOM INDEX demo1_demo_name_index ON demo1.demo (name) USING 'com.datastax.bdp.search.solr.Cql3SolrSecondaryIndex';
CREATE CUSTOM INDEX demo1_demo_solr_query_index ON demo1.demo (solr_query) USING 'com.datastax.bdp.search.solr.Cql3SolrSecondaryIndex';
CREATE CUSTOM INDEX demo1_demo_status_index ON demo1.demo (status) USING 'com.datastax.bdp.search.solr.Cql3SolrSecondaryIndex';
What I would like to understand is whether these are purely Solr indexes that just "show up" in the Cassandra output because of the integration, or whether they are actually "full Cassandra indexes" (for lack of a better name; I mean an index I could create using the CREATE INDEX CQL statement). The concern is that if they are Cassandra indexes, they will create a performance problem, as the corresponding data is likely to have high cardinality.
If they are not "full Cassandra indexes", then I'm wondering why there are issues creating Solr cores over frozen fields. I.e. if I create a column family of:
cassandra@cqlsh:demo1> CREATE TABLE demo2 (
"id" VARCHAR PRIMARY KEY,
"name" frozen<Name>,
"friends" frozen<list<Name>> );
Solr core creation (dsetool create_core with generateResources=true) fails with:
WARN [demo1.demo2 Index WorkPool scheduler thread-0] 2016-02-09 13:57:14,781 WorkPool.java:672 - Listener com.datastax.bdp.search.solr.AbstractSolrSecondaryIndex$SSIIndexPoolListener@69442bb6 failed for pool demo1.demo2 Index with exception: SolrCore 'demo1.demo2' is not available due to init failure: org.apache.cassandra.exceptions.InvalidRequestException: Frozen collections currently only support full-collection indexes. For example, 'CREATE INDEX ON <table>(full(<columnName>))'.
org.apache.solr.common.SolrException: SolrCore 'demo1.demo2' is not available due to init failure: org.apache.cassandra.exceptions.InvalidRequestException: Frozen collections currently only support full-collection indexes. For example, 'CREATE INDEX ON <table>(full(<columnName>))'.
at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:742) ~[solr-uber-with-auth_2.0-4.10.3.1.287.jar:4.10.3.1.287]
at com.datastax.bdp.search.solr.core.CassandraCoreContainer.getCore(CassandraCoreContainer.java:171) ~[dse-search-4.8.4.jar:4.8.4]
at com.datastax.bdp.search.solr.AbstractSolrSecondaryIndex.getCore(AbstractSolrSecondaryIndex.java:546) ~[dse-search-4.8.4.jar:4.8.4]
at com.datastax.bdp.search.solr.AbstractSolrSecondaryIndex$SSIIndexPoolListener.onBackPressure(AbstractSolrSecondaryIndex.java:1467) ~[dse-search-4.8.4.jar:4.8.4]
(This, of course, works just fine when following the examples in the blog, which use a list of frozen fields, i.e. list<frozen<name>>, rather than a frozen list of fields, i.e. frozen<list<Name>>.)
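For comparison, here is a minimal sketch of the variant that does not hit the error, assuming the same Name type as above. The table name demo3 is made up for illustration; the only difference from demo2 is that "friends" is a list of frozen values rather than a frozen list, which matches the friends column in demo1.demo:

```sql
-- Hypothetical table name (demo3), used only for illustration.
-- Same columns as demo2, but "friends" is list<frozen<Name>>
-- instead of frozen<list<Name>>.
CREATE TABLE demo1.demo3 (
    "id" VARCHAR PRIMARY KEY,
    "name" frozen<Name>,
    "friends" list<frozen<Name>>
);
```

With this definition I would expect dsetool create_core with generateResources=true to succeed, as it did for demo1.demo.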