We're using CloudFormation to automate the setup and teardown of several Cassandra clusters that we use for load testing. During a load test, we use OpsCenter to monitor our throughput. What I've found is that storing the OpsCenter data in the test's target cluster skews our nodes' data ownership information. As a result, I'd like to move OpsCenter and the agent data to its own node. I have a single c3.4xlarge set up with a single Cassandra instance and OpsCenter. I have the following configuration files.

opscenter server

/etc/opscenter/clusters/usergrid.conf

[cassandra]
seed_hosts = ec2-23-22-188-56.compute-1.amazonaws.com,ec2-54-163-164-41.compute-1.amazonaws.com,ec2-54-166-10-160.compute-1.amazonaws.com,ec2-54-166-219-212.compute-1.amazonaws.com,ec2-54-211-181-126.compute-1.amazonaws.com,ec2-54-82-161-157.compute-1.amazonaws.com,ec2-54-82-30-122.compute-1.amazonaws.com,ec2-54-83-98-182.compute-1.amazonaws.com,ec2-54-91-209-251.compute-1.amazonaws.com

[storage_cassandra]
seed_hosts = ec2-54-204-237-40.compute-1.amazonaws.com
api_port = 9160

datastax-agent

cat /var/lib/datastax-agent/conf/address.yaml

stomp_interface: ec2-54-204-237-40.compute-1.amazonaws.com
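As a sanity check on which storage hosts usergrid.conf actually declares, the file can be read as INI. This is a minimal sketch, assuming the file parses with Python's standard `configparser`; the helper name `storage_seed_hosts` and the shortened `SAMPLE` text are made up for illustration and are not part of OpsCenter.

```python
import configparser


def storage_seed_hosts(conf_text):
    """Return the seed hosts listed under [storage_cassandra], or [] if absent."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    if not parser.has_section("storage_cassandra"):
        return []
    raw = parser.get("storage_cassandra", "seed_hosts", fallback="")
    # seed_hosts is a comma-delimited list; drop empty entries and whitespace.
    return [host.strip() for host in raw.split(",") if host.strip()]


# Shortened copy of the cluster config from the question (illustrative only).
SAMPLE = """\
[cassandra]
seed_hosts = ec2-23-22-188-56.compute-1.amazonaws.com,ec2-54-163-164-41.compute-1.amazonaws.com

[storage_cassandra]
seed_hosts = ec2-54-204-237-40.compute-1.amazonaws.com
api_port = 9160
"""

print(storage_seed_hosts(SAMPLE))
```

Against the real file, the same function could be fed the contents of /etc/opscenter/clusters/usergrid.conf.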

However, the agents log the following in /var/log/datastax-agent/agent.log.

INFO [thrift-init] 2014-11-03 14:33:41,069 Connected to Cassandra cluster: usergrid
INFO [thrift-init] 2014-11-03 14:33:41,071 in execute with client org.apache.cassandra.thrift.Cassandra$Client@6deebf54


INFO [thrift-init] 2014-11-03 14:33:41,072 Using partitioner: org.apache.cassandra.dht.Murmur3Partitioner
 INFO [pdp-loader] 2014-11-03 14:33:41,072 Attempting to load stored metric values.
ERROR [pdp-loader] 2014-11-03 14:33:41,092 There was an error when attempting to load stored rollups.
me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:Keyspace 'OpsCenter' does not exist)
    at me.prettyprint.cassandra.connection.client.HThriftClient.getCassandra(HThriftClient.java:112)
    at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:251)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:132)
    at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSlice(KeyspaceServiceImpl.java:290)
    at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:53)
    at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery$1.doInKeyspace(ThriftSliceQuery.java:49)
    at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
    at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:101)
    at me.prettyprint.cassandra.model.thrift.ThriftSliceQuery.execute(ThriftSliceQuery.java:48)
    at clj_hector.core$execute_query.doInvoke(core.clj:201)
    at clojure.lang.RestFn.invoke(RestFn.java:423)
    at clj_hector.core$get_column_range.doInvoke(core.clj:298)
    at clojure.lang.RestFn.invoke(RestFn.java:587)
    at opsagent.cassandra$scan_pdps$fn__1051.invoke(cassandra.clj:182)
    at opsagent.cassandra$scan_pdps.invoke(cassandra.clj:181)
    at opsagent.cassandra$process_pdp_row$fn__1060.invoke(cassandra.clj:199)
    at opsagent.cassandra$process_pdp_row.invoke(cassandra.clj:197)
    at opsagent.cassandra$process_pdp_row.invoke(cassandra.clj:195)
    at opsagent.cassandra$load_pdps_with_retry$fn__1066.invoke(cassandra.clj:213)
    at opsagent.cassandra$load_pdps_with_retry.invoke(cassandra.clj:210)
    at opsagent.cassandra$setup_cassandra$f__388__auto____1094$fn__1095$f__388__auto____1102.invoke(cassandra.clj:357)
    at clojure.lang.AFn.run(AFn.java:24)
    at java.lang.Thread.run(Thread.java:745)
Caused by: InvalidRequestException(why:Keyspace 'OpsCenter' does not exist)
    at org.apache.cassandra.thrift.Cassandra$set_keyspace_result.read(Cassandra.java:5452)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
    at org.apache.cassandra.thrift.Cassandra$Client.recv_set_keyspace(Cassandra.java:531)
    at org.apache.cassandra.thrift.Cassandra$Client.set_keyspace(Cassandra.java:518)
    at me.prettyprint.cassandra.connection.client.HThriftClient.getCassandra(HThriftClient.java:110)
    ... 22 more

Generally this would indicate that the client cannot connect to the storage Cassandra node. However, from the agent node, I can execute the following command.

cassandra-cli -h ec2-54-204-237-40.compute-1.amazonaws.com

From there I can describe the keyspace, and it works.

[default@unknown] describe  OpsCenter;

WARNING: CQL3 tables are intentionally omitted from 'describe' output.
See https://issues.apache.org/jira/browse/CASSANDRA-4377 for details.

Keyspace: OpsCenter:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
    Options: [us-east:1]
  Column Families:
    ColumnFamily: bestpractice_results
    "{"info": "OpsCenter management data.", "version": [5, 0, 1]}"
      Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
      Default column value validator: org.apache.cassandra.db.marshal.BytesType
      Cells sorted by: org.apache.cassandra.db.marshal.ReversedType(org.apache.cassandra.db.marshal.IntegerType)
      GC grace seconds: 0
      Compaction min/max thresholds: 4/32
      Read repair chance: 0.25
      DC Local Read repair chance: 0.0
      Populate IO Cache on flush: false
      Replicate on write: true
      Caching: KEYS_ONLY
      Bloom Filter FP chance: 0.01
      Built indexes: []
      Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
      Compression Options:
        sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
    ColumnFamily: events
    "{"info": "OpsCenter management data.", "version": [5, 0, 1]}"
      Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
      Default column value validator: org.apache.cassandra.db.marshal.BytesType
      Cells sorted by: org.apache.cassandra.db.marshal.UTF8Type
      GC grace seconds: 864000
      Compaction min/max thresholds: 4/32
      Read repair chance: 0.25
      DC Local Read repair chance: 0.0
      Populate IO Cache on flush: false
      Replicate on write: true
      Caching: KEYS_ONLY
      Bloom Filter FP chance: 0.01
      Built indexes: []
      Column Metadata:
        Column Name: success
          Validation Class: org.apache.cassandra.db.marshal.BooleanType
        Column Name: action
          Validation Class: org.apache.cassandra.db.marshal.LongType
        Column Name: level
          Validation Class: org.apache.cassandra.db.marshal.LongType
        Column Name: time
          Validation Class: org.apache.cassandra.db.marshal.LongType
      Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
      Compression Options:
        sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
    ColumnFamily: events_timeline
    "{"info": "OpsCenter management data.", "version": [5, 0, 1]}"
      Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
      Default column value validator: org.apache.cassandra.db.marshal.BytesType
      Cells sorted by: org.apache.cassandra.db.marshal.LongType
      GC grace seconds: 864000
      Compaction min/max thresholds: 4/32
      Read repair chance: 0.25
      DC Local Read repair chance: 0.0
      Populate IO Cache on flush: false
      Replicate on write: true
      Caching: KEYS_ONLY
      Bloom Filter FP chance: 0.01
      Built indexes: []
      Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
      Compression Options:
        sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
    ColumnFamily: pdps
    "{"info": "OpsCenter management data.", "version": [5, 0, 1]}"
      Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
      Default column value validator: org.apache.cassandra.db.marshal.BytesType
      Cells sorted by: org.apache.cassandra.db.marshal.UTF8Type
      GC grace seconds: 0
      Compaction min/max thresholds: 4/32
      Read repair chance: 0.25
      DC Local Read repair chance: 0.0
      Populate IO Cache on flush: false
      Replicate on write: true
      Caching: KEYS_ONLY
      Bloom Filter FP chance: 0.01
      Built indexes: []
      Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
      Compression Options:
        sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
    ColumnFamily: rollups300
    "{"info": "OpsCenter management data.", "version": [5, 0, 1]}"
      Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
      Default column value validator: org.apache.cassandra.db.marshal.BytesType
      Cells sorted by: org.apache.cassandra.db.marshal.IntegerType
      GC grace seconds: 0
      Compaction min/max thresholds: 4/32
      Read repair chance: 0.25
      DC Local Read repair chance: 0.0
      Populate IO Cache on flush: false
      Replicate on write: true
      Caching: KEYS_ONLY
      Bloom Filter FP chance: 0.01
      Built indexes: []
      Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
      Compression Options:
        sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
    ColumnFamily: rollups60
    "{"info": "OpsCenter management data.", "version": [5, 0, 1]}"
      Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
      Default column value validator: org.apache.cassandra.db.marshal.BytesType
      Cells sorted by: org.apache.cassandra.db.marshal.IntegerType
      GC grace seconds: 0
      Compaction min/max thresholds: 4/32
      Read repair chance: 0.25
      DC Local Read repair chance: 0.0
      Populate IO Cache on flush: false
      Replicate on write: true
      Caching: KEYS_ONLY
      Bloom Filter FP chance: 0.01
      Built indexes: []
      Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
      Compression Options:
        sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
    ColumnFamily: rollups7200
    "{"info": "OpsCenter management data.", "version": [5, 0, 1]}"
      Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
      Default column value validator: org.apache.cassandra.db.marshal.BytesType
      Cells sorted by: org.apache.cassandra.db.marshal.IntegerType
      GC grace seconds: 0
      Compaction min/max thresholds: 4/32
      Read repair chance: 0.25
      DC Local Read repair chance: 0.0
      Populate IO Cache on flush: false
      Replicate on write: true
      Caching: KEYS_ONLY
      Bloom Filter FP chance: 0.01
      Built indexes: []
      Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
      Compression Options:
        sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
    ColumnFamily: rollups86400
    "{"info": "OpsCenter management data.", "version": [5, 0, 1]}"
      Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
      Default column value validator: org.apache.cassandra.db.marshal.BytesType
      Cells sorted by: org.apache.cassandra.db.marshal.IntegerType
      GC grace seconds: 0
      Compaction min/max thresholds: 4/32
      Read repair chance: 0.25
      DC Local Read repair chance: 0.0
      Populate IO Cache on flush: false
      Replicate on write: true
      Caching: KEYS_ONLY
      Bloom Filter FP chance: 0.01
      Built indexes: []
      Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
      Compression Options:
        sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
    ColumnFamily: settings
    "{"info": "OpsCenter management data.", "version": [5, 0, 1]}"
      Key Validation Class: org.apache.cassandra.db.marshal.BytesType
      Default column value validator: org.apache.cassandra.db.marshal.BytesType
      Cells sorted by: org.apache.cassandra.db.marshal.BytesType
      GC grace seconds: 864000
      Compaction min/max thresholds: 4/32
      Read repair chance: 1.0
      DC Local Read repair chance: 0.0
      Populate IO Cache on flush: false
      Replicate on write: true
      Caching: KEYS_ONLY
      Bloom Filter FP chance: 0.01
      Built indexes: []
      Compaction Strategy: org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy
      Compression Options:
        sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
[default@unknown]

This tells me that the target Cassandra node is up and running and has the keyspace and its column families. It also indicates that there is no network or firewall issue between the agent and Cassandra. I'm at a loss to explain why I'm receiving this error message. Am I still missing something in my configuration, or is this a bug?

Versions: Cassandra 1.2.19, OpsCenter 5.0.1, DS Agent 5.0.1

Any help would be greatly appreciated!

Thanks, Todd

UPDATE

Here is the agent log. Note that my IPs have changed since this is a new environment. It appears that the agent is trying to connect to 10.81.168.96:9160, which is NOT the EC2 address in my settings, ec2-174-129-181-123.compute-1.amazonaws.com. I'm not sure where that address is coming from, but it's not what is set on the OpsCenter server.

agent.log

https://gist.github.com/tnine/f509c120465eb80ade92

Sorry for the gist, but I've exceeded the character limit.
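One way to narrow down where 10.81.168.96 comes from is to check, from the agent node, whether the configured hostname resolves to it. On EC2 a public DNS name generally resolves to the instance's private address when looked up from inside the same region, so the address may be the storage node after all, or it may be the agent's local node. The sketch below only shows the comparison; the function name `hostname_resolves_to` and its injectable `resolver` parameter are assumptions made for illustration.

```python
import socket


def hostname_resolves_to(hostname, ip, resolver=socket.gethostbyname_ex):
    """Return True if `ip` is among the IPv4 addresses `hostname` resolves to."""
    try:
        _name, _aliases, addresses = resolver(hostname)
    except OSError:
        # Covers socket.gaierror and socket.herror (lookup failures).
        return False
    return ip in addresses


# Example call, to be run on the agent node (performs a real DNS lookup):
# hostname_resolves_to("ec2-174-129-181-123.compute-1.amazonaws.com", "10.81.168.96")
```

If the call returns True, the agent is reaching the storage node by its private address. If it returns False, the address likely belongs to some other node, such as the one the agent runs on.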

That configuration looks correct, so it may be a bug. Can you enable debug logging on the agent, restart it, and attach the log somewhere we can view it? I wonder if the agent is still connecting to the local node instead of the new dedicated cluster. - mbulman
@mbulman Thanks for the reply. I took your advice and set an agent to log level debug. Note that because this is an EC2 ephemeral environment, my IP addresses have changed. Here is the log. - tnine
Unfortunately, using a separate storage cluster is only supported when using DataStax Enterprise. Are you only using Apache Cassandra here? - nickmbailey

1 Answer

If you look through the logs of the OpsCenter instance, you should see this:

exceptions.Exception: Storing data in a separate cluster is only supported when managing DSE clusters.

The OpsCenter docs, however, say this:

[storage_cassandra] seed_hosts
Used when using a different cluster for OpsCenter storage. A Cassandra seed node is used to determine the ring topology and obtain gossip information about the nodes in the cluster. This should be the same comma-delimited list of seed nodes as the one configured for your Cassandra or DataStax Enterprise cluster by the seeds property in the cassandra.yaml configuration file.

So you would think it's possible, but apparently it's not when the monitored cluster runs Apache Cassandra rather than DSE.
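To confirm that the OpsCenter server is rejecting the separate storage cluster, its log can be scanned for the exception quoted above. This is a minimal sketch: the helper name `find_storage_errors` is made up, and the log path in the usage comment is an assumption about a package install that may differ on your system.

```python
MESSAGE = (
    "Storing data in a separate cluster is only supported "
    "when managing DSE clusters"
)


def find_storage_errors(lines, needle=MESSAGE):
    """Return (line_number, text) pairs for every line containing `needle`."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if needle in line
    ]


# Usage on the OpsCenter server (path is an assumption, adjust as needed):
# with open("/var/log/opscenter/opscenterd.log") as log:
#     for number, text in find_storage_errors(log):
#         print(number, text)
```

An empty result would suggest the agent error has a different cause and is worth investigating separately.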