
I found a similar post but it didn't help.

I've been working with Cassandra for a little while, and now I'm trying to set up Spark and the spark-cassandra-connector. I'm using IntelliJ IDEA for this (it's my first time with both IntelliJ IDEA and Scala, so you get the picture).

My OS is Windows 10. This is what I've done:

Inside ../spark-2.4.5-bin-hadoop2.7/bin: spark-class.cmd org.apache.spark.deploy.master.Master

Inside ../spark-2.4.5-bin-hadoop2.7/bin: spark-class.cmd org.apache.spark.deploy.worker.Worker -c 1 spark://192.168.0.3:7077

build.gradle

apply plugin: 'scala'
apply plugin: 'idea'
apply plugin: 'eclipse'

repositories {
    mavenCentral()
}

idea {
    project {
        jdkName = '1.8'
        languageLevel = '1.8'
    }
}

dependencies {
    compile group: 'org.apache.spark', name: 'spark-core_2.11', version: '2.4.5'
    compile group: 'org.apache.spark', name: 'spark-sql_2.11', version: '2.4.5'
    compile group: 'org.scala-lang', name: 'scala-library', version: '2.11.11'
    compile group: 'com.datastax.spark', name: 'spark-cassandra-connector_2.11', version: '2.4.0'
}

configurations.all {
    resolutionStrategy {
        force 'com.google.guava:guava:12.0.1'
    }
}

compileScala.targetCompatibility = "1.8"
compileScala.sourceCompatibility = "1.8"

SparkModule.scala

package org.sentinel.spark_module

import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._

object SentinelSparkModule {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().set("spark.cassandra.connection.host", "127.0.0.1")
      .set("spark.cassandra.connection.port", "9042")
      .setAppName("Sentinel").setMaster("spark://192.168.0.3:7077")

    val sc = new SparkContext(conf)
    val rdd = sc.cassandraTable("keyspace", "table")
    // Group the rows by the value of "column" and print the first 10 groups.
    rdd.groupBy(row => row.getString("column")).take(10).foreach(println)
  }
}

Even though the error occurs, I can still see the app running at http://localhost:8080/ until I stop the execution in the IDE.

Excerpt of the full stack dump:

Exception in thread "main" java.io.IOException: Failed to open native connection to Cassandra at {127.0.0.1}:9042

Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /127.0.0.1:9042 (com.datastax.driver.core.exceptions.OperationTimedOutException: [/127.0.0.1:9042] Operation timed out))

Finally, even though the error says the operation timed out, I am also querying Cassandra from my web app (node.js) while coding this, and those queries work fine. So I don't see why the problem would be on Cassandra's side, but I guess it could be.

Thanks

EDIT:

I added compile group: 'com.datastax.cassandra', name: 'cassandra-driver-core', version: '3.0.0' (based on the version compatibility table) and got the same error.

EDIT:

nodetool status shows:

Datacenter: datacenter1
========================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load        Tokens       Owns (effective)  Host ID                               Rack
UN  127.0.0.1  138.59 MiB  256          100.0%            77d808e6-5c57-494a-b6fb-7e73593dbb46  rack1

EDIT:

cqlsh 127.0.0.1 9042 shows:

WARNING: console codepage must be set to cp65001 to support utf-8 encoding on Windows platforms.
If you experience encoding problems, change your console codepage with 'chcp 65001' before starting cqlsh.

Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.4 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
WARNING: pyreadline dependency missing.  Install to enable tab completion.
cqlsh>
You shouldn't include the Cassandra driver explicitly; it's already inside the connector. - Alex Ott
Can you run nodetool status against your Cassandra cluster? - Alex Ott
@AlexOtt "You shouldn't include the Cassandra driver explicitly; it's inside the connector": do you mean I should remove compile group: 'com.datastax.cassandra', name: 'cassandra-driver-core', version: '3.0.0'? Also, I've added the output of nodetool status. Thanks. - Scaramouche
Yes, you need to remove this dependency; everything is in the connector. - Alex Ott
Can you also try running cqlsh 127.0.0.1 9042? - Alex Ott

1 Answer


Is Cassandra also running on 192.168.0.3? Have you tried setting spark.cassandra.connection.host to 192.168.0.3 instead of 127.0.0.1? You are seeing that error because your Spark executor cannot connect to Cassandra at 127.0.0.1. I don't know the details of your setup, and you may have tried this already, but the fix could be as simple as that.
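
If you want to check this before touching the Spark configuration, here is a minimal sketch that only tests whether the native port accepts TCP connections at each address. It uses nothing but the JDK, so it can be run on whichever machine hosts the worker. The object name CassandraPortCheck and the helper canConnect are made up for illustration, and the two addresses and port 9042 are simply the ones from the question. A successful TCP connect does not prove that the CQL handshake will succeed, so treat the result as a hint rather than a diagnosis.

```scala
import java.io.IOException
import java.net.{InetSocketAddress, Socket}

// Hypothetical helper, not part of Spark or the connector.
object CassandraPortCheck {

  // Returns true if a plain TCP connection to host:port succeeds within timeoutMs.
  def canConnect(host: String, port: Int, timeoutMs: Int = 3000): Boolean = {
    val socket = new Socket()
    try {
      socket.connect(new InetSocketAddress(host, port), timeoutMs)
      true
    } catch {
      // Covers connection refused, timeouts and unresolved host names.
      case _: IOException => false
    } finally {
      socket.close()
    }
  }

  def main(args: Array[String]): Unit = {
    // Addresses taken from the question; adjust them to your own setup.
    Seq("127.0.0.1", "192.168.0.3").foreach { host =>
      println(s"$host:9042 reachable: ${canConnect(host, 9042)}")
    }
  }
}
```

If only one of the two addresses turns out to be reachable, that is the one I would try as the value of spark.cassandra.connection.host.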