I'm trying to connect to an Apache Cassandra database from Apache Spark (2.3.0, shell) using the DataStax Spark Cassandra connector (datastax:spark-cassandra-connector:2.3.0-s_2.11).

I'm passing the settings with the --conf option on the command line, but when I try to run a database query it errors out, saying that it can't open a native connection to 127.0.0.1:9042.

Step 1 (I'm running this command inside the folder where Spark is installed.)

  • # ./bin/spark-shell --conf spark.cassandra-connection.host=localhost spark.cassandra-connection.native.port=32771 --packages datastax:spark-cassandra-connector:2.3.0-s_2.11

Step 2 (I'm running these steps in the scala> shell of Spark)

  • scala> import com.datastax.spark.connector._
  • scala> import org.apache.spark.sql.cassandra._
  • scala> val rdd = sc.cassandraTable("market", "markethistory")
  • scala> println(rdd.first)

Step 3 (It errors out)

  • java.io.IOException: Failed to open native connection to Cassandra at {127.0.0.1}:9042 +stacktrace

  • Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: localhost/127.0.0.1:9042 (com.datastax.driver.core.exceptions.TransportException: [localhost/127.0.0.1:9042] Cannot connect)) +stacktrace

Additional notes:

Notice that the error reports port 9042, not the 32771 I passed on the command line.

I've also tried changing the host in the --conf option and that doesn't change the output of the error.

My main assumption is that I need to specify the host and port in Scala, but I'm unsure how. The DataStax documentation is all about their own Spark distribution, and it doesn't seem to match up with my setup.

Things I've tried:

  • spark.cassandra-connection.port=32771
  • spark.cassandra.connection.port=32771
  • spark.cassandra.connection.host=localhost

Thanks in advance.

The property is spark.cassandra.connection.host, not spark.cassandra-connection.host. - Alper t. Turker
It still throws the same error with spark.cassandra.connection.host=localhost and spark.connection.connection.port=32771. - dillon37
It's spark.cassandra.connection.port, not spark.connection.connection.port. - Alper t. Turker
Yeah, I typed that correctly in the console; I mistyped it here. Apologies. - dillon37

1 Answer

The answer was twofold:

  • The property names do indeed use cassandra.connection, not cassandra-connection
  • --conf has to come after --packages

Thanks to @user8371915 for the connection string difference.
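
For anyone landing here, the two fixes above can be sketched as a single command line. This is only a sketch built from the values in the question: the variable names CASSANDRA_HOST, CASSANDRA_PORT and SPARK_CMD are made up for illustration, and the host and port are the ones I happened to use. It also gives each property its own --conf flag, since spark-shell takes that option as --conf PROP=VALUE. The snippet only assembles and prints the command so it can be checked before running it.

```shell
# Values from the question; adjust to your own Cassandra host and mapped port.
CASSANDRA_HOST=localhost
CASSANDRA_PORT=32771

# Assemble the command: --packages first, then one --conf per property,
# using the dotted property names (cassandra.connection, not cassandra-connection).
SPARK_CMD="./bin/spark-shell"
SPARK_CMD="$SPARK_CMD --packages datastax:spark-cassandra-connector:2.3.0-s_2.11"
SPARK_CMD="$SPARK_CMD --conf spark.cassandra.connection.host=$CASSANDRA_HOST"
SPARK_CMD="$SPARK_CMD --conf spark.cassandra.connection.port=$CASSANDRA_PORT"

# Dry run: print the command instead of launching Spark.
echo "$SPARK_CMD"
```

Copy the printed line into a terminal from the Spark folder to start the shell; the imports and the sc.cassandraTable call from Step 2 stay the same.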