0 votes

I am trying to launch a small app that only reads a table from a Cassandra database.

I launched the app with spark-submit:

  • /opt/spark/bin/spark-submit --class com.baitic.mcava.pruebacassandra.PruebaCassandraBBDD --master spark://192.168.1.105:7077 --executor-memory 1G /home/miren/NetBeansProjects/PruebaCassandra/target/original-PruebaCassandra-1.0-SNAPSHOT.jar --deploy-mode cluster
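Note that spark-submit treats everything after the application jar as arguments to the application itself, so `--deploy-mode cluster` as written above is passed to the app rather than to Spark. All Spark flags need to come before the jar (same paths as in the question):

```shell
/opt/spark/bin/spark-submit \
  --class com.baitic.mcava.pruebacassandra.PruebaCassandraBBDD \
  --master spark://192.168.1.105:7077 \
  --deploy-mode cluster \
  --executor-memory 1G \
  /home/miren/NetBeansProjects/PruebaCassandra/target/original-PruebaCassandra-1.0-SNAPSHOT.jar
```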

Output when I launch it; only the part where it breaks is shown, the rest works fine:

    16/02/25 11:18:34 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
    Exception in thread "main" java.lang.NoClassDefFoundError: com/datastax/spark/connector/japi/CassandraJavaUtil
        at com.baitic.mcava.pruebacassandra.PruebaCassandraBBDD.main(PruebaCassandraBBDD.java:71)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    Caused by: java.lang.ClassNotFoundException: com.datastax.spark.connector.japi.CassandraJavaUtil
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 10 more

I created a Maven Java app and put the necessary dependencies in the pom.xml:

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.6.0</version>
        </dependency>

        <dependency>
            <groupId>com.datastax.spark</groupId>
            <artifactId>spark-cassandra-connector_2.10</artifactId>
            <version>1.6.0-M1</version>
        </dependency>

        <dependency>
            <groupId>com.datastax.cassandra</groupId>
            <artifactId>cassandra-driver-core</artifactId>
            <version>3.0.0</version>
        </dependency>

        <dependency>
            <groupId>com.datastax.spark</groupId>
            <artifactId>spark-cassandra-connector-java_2.10</artifactId>
            <version>1.5.0</version>
        </dependency>
    </dependencies>
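A common fix for the `NoClassDefFoundError` above is to bundle the connector classes into the application jar with the Maven shade plugin, so nothing extra has to be placed on the Spark classpath. A sketch (the plugin version here is an assumption):

```xml
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>2.4.3</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

One detail worth checking: the shade plugin renames the unshaded jar with an `original-` prefix and writes the fat jar without that prefix, so a jar called `original-PruebaCassandra-1.0-SNAPSHOT.jar` is the one *without* dependencies; the shaded jar is the one to pass to spark-submit.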

Imports:

  • import org.apache.spark.api.java.JavaSparkContext;
  • import org.apache.commons.lang.StringUtils;
  • import com.datastax.spark.connector.japi.CassandraRow;
  • import org.apache.spark.SparkConf;
  • import org.apache.spark.api.java.JavaRDD;
  • import org.apache.spark.api.java.function.Function;
  • import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;

....

Main code:

    JavaRDD<String> cassandraRowsRDD = javaFunctions(sc)
            .cassandraTable("ks", "sensor_readings")
            .map(new Function<CassandraRow, String>() {
                @Override
                public String call(CassandraRow cassandraRow) throws Exception {
                    return cassandraRow.toString();
                }
            });

    System.out.println("Data as CassandraRows \n"
            + StringUtils.join(cassandraRowsRDD.toArray(), "\n"));
I solved the problem: I compiled the Cassandra connector jar myself with sbt instead of adding it through Maven, and added its path in spark-defaults.conf. But the app still didn't work; now I have a serialization problem when I call collect(). – Miren

I solved all the problems; the app now works perfectly. – Miren
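The serialization error Miren mentions when calling collect() usually means the map function, or something it captures (often the enclosing class), is not serializable. A minimal plain-Java sketch of the check Spark effectively performs; `RowFn` here is a hypothetical stand-in for Spark's `Function`, which also extends `Serializable`:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializableCheck {

    // Stand-in for Spark's Function interface: every function shipped to
    // executors must survive Java serialization.
    interface RowFn extends Serializable {
        String call(String row);
    }

    // Mimics what Spark requires: the function object, and everything it
    // captures, must be writable to a byte stream.
    static boolean serializes(Object o) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(o);
            return true;
        } catch (IOException e) { // NotSerializableException is an IOException
            return false;
        }
    }

    public static void main(String[] args) {
        // Defined in a static context, so no enclosing instance is captured.
        RowFn fn = new RowFn() {
            @Override
            public String call(String row) {
                return row.toUpperCase();
            }
        };
        System.out.println("function serializes: " + serializes(fn));
    }
}
```

Defining the function as a static nested class, or making the enclosing class implement `Serializable`, is the usual way around this in Spark jobs.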

2 Answers

2 votes

Make sure you have the following in your import statements:

import static com.datastax.spark.connector.japi.CassandraJavaUtil.*;

If you are running this code with any custom classes, the path to the jar file will need to be added to the spark-defaults.conf file with the parameter:

spark.driver.extraClassPath     /whateverpath/to/file/
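For example, spark-defaults.conf might contain the lines below; the jar path and file name are illustrative, and the executors generally need the classes as well as the driver:

```
spark.driver.extraClassPath     /opt/jars/spark-cassandra-connector_2.10-1.6.0-M1.jar
spark.executor.extraClassPath   /opt/jars/spark-cassandra-connector_2.10-1.6.0-M1.jar
```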
0 votes

Try using the --jars option in the spark-submit command: download the Cassandra connector jar and pass its name to --jars, or specify it as extraClassPath in the Spark defaults file.
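For example (the connector jar path and application jar path are illustrative):

```shell
spark-submit \
  --class com.baitic.mcava.pruebacassandra.PruebaCassandraBBDD \
  --master spark://192.168.1.105:7077 \
  --jars /opt/jars/spark-cassandra-connector_2.10-1.6.0-M1.jar \
  /opt/jars/app.jar
```

Jars listed in --jars are copied to the driver and executors automatically, so nothing has to be edited in the Spark configuration files.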