When integrating Solr with Cassandra using DSE, adding a Solr core for a column family creates an index on every top-level field that is indexed in the Solr schema. With the example column family and Solr schema outlined here, several indexes are generated:

cassandra@cqlsh:demo1> desc demo;

CREATE TABLE demo1.demo (
    id text PRIMARY KEY,
    friends list<frozen<name>>,
    magic_numbers frozen<tuple<int, int, int>>,
    name frozen<name>,
    solr_query text,
    status text
[skipped]
CREATE CUSTOM INDEX demo1_demo_friends_index ON demo1.demo (friends) USING 'com.datastax.bdp.search.solr.Cql3SolrSecondaryIndex';
CREATE CUSTOM INDEX demo1_demo_magic_numbers_index ON demo1.demo (magic_numbers) USING 'com.datastax.bdp.search.solr.Cql3SolrSecondaryIndex';
CREATE CUSTOM INDEX demo1_demo_name_index ON demo1.demo (name) USING 'com.datastax.bdp.search.solr.Cql3SolrSecondaryIndex';
CREATE CUSTOM INDEX demo1_demo_solr_query_index ON demo1.demo (solr_query) USING 'com.datastax.bdp.search.solr.Cql3SolrSecondaryIndex';
CREATE CUSTOM INDEX demo1_demo_status_index ON demo1.demo (status) USING 'com.datastax.bdp.search.solr.Cql3SolrSecondaryIndex';

What I would like to understand is whether these indexes are just true Solr indexes, and just "show up" in Cassandra output because there is some integration that is going on, or they are actually "full Cassandra indexes" (for lack of a better name; I mean an index I could create with a CREATE INDEX CQL statement). My concern is that if they are Cassandra indexes, they will cause a performance problem, because the corresponding data is likely to have high cardinality.

If they are not "full Cassandra indexes", then I'm wondering why there are issues creating Solr cores over frozen fields. For example, if I create this column family:

cassandra@cqlsh:demo1> CREATE TABLE demo2 (
  "id" VARCHAR PRIMARY KEY,
  "name" frozen<Name>,
  "friends" frozen<list<Name>> );

Solr core creation (dsetool create_core with generateResources=true) fails with:

WARN  [demo1.demo2 Index WorkPool scheduler thread-0] 2016-02-09 13:57:14,781  WorkPool.java:672 - Listener com.datastax.bdp.search.solr.AbstractSolrSecondaryIndex$SSIIndexPoolListener@69442bb6 failed for pool demo1.demo2 Index with exception: SolrCore 'demo1.demo2' is not available due to init failure: org.apache.cassandra.exceptions.InvalidRequestException: Frozen collections currently only support full-collection indexes. For example, 'CREATE INDEX ON <table>(full(<columnName>))'.
org.apache.solr.common.SolrException: SolrCore 'demo1.demo2' is not available due to init failure: org.apache.cassandra.exceptions.InvalidRequestException: Frozen collections currently only support full-collection indexes. For example, 'CREATE INDEX ON <table>(full(<columnName>))'.
        at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:742) ~[solr-uber-with-auth_2.0-4.10.3.1.287.jar:4.10.3.1.287]
        at com.datastax.bdp.search.solr.core.CassandraCoreContainer.getCore(CassandraCoreContainer.java:171) ~[dse-search-4.8.4.jar:4.8.4]
        at com.datastax.bdp.search.solr.AbstractSolrSecondaryIndex.getCore(AbstractSolrSecondaryIndex.java:546) ~[dse-search-4.8.4.jar:4.8.4]
        at com.datastax.bdp.search.solr.AbstractSolrSecondaryIndex$SSIIndexPoolListener.onBackPressure(AbstractSolrSecondaryIndex.java:1467) ~[dse-search-4.8.4.jar:4.8.4]

(This, of course, works just fine when I follow the examples in the blog, which use a list of frozen fields, list<frozen<name>>, rather than a frozen list of fields, frozen<list<name>>.)

1 Answer

What I would like to understand is whether these indexes are just true Solr indexes, and just "show up" in Cassandra output because there is some integration that is going on, or they are actually "full Cassandra indexes"

DSE Search indexes use Cassandra's secondary index API to provide a bridge between the Cassandra write path and the Solr document update machinery. They are not "full Cassandra indexes" in the sense you've mentioned in your question, even though you see multiple index entries in your table description. Each one of those entries represents a single indexed field in the same Solr core.
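
As a rough illustration of that point, searches against the core go through the solr_query column from your table rather than through per-column CQL index lookups. The sketch below assumes the core on demo1.demo was created successfully; the search term status:active is a made-up value, not something taken from your data.

```cql
-- Query the single Solr core behind demo1.demo via the solr_query column.
-- 'status:active' is a hypothetical Solr query string used only for illustration.
SELECT id, status
FROM demo1.demo
WHERE solr_query = 'status:active';
```

A query on a different indexed field, such as name or friends, would go through the same solr_query column and the same core, which is why the separate index entries in the table description should not be read as separate Cassandra-style secondary indexes.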

I'm wondering why there are issues creating Solr cores over frozen fields.

Were you able to follow the blog post you mentioned to completion, or do you observe your error there as well? If you can follow it to the end without errors, perhaps we can isolate your problem using that as a baseline. (I'm assuming you've used dsetool create_core with generateResources=true to create the core in question.)