I am trying to create a word count project using Spark and Java in Eclipse, on a Cloudera VM running in VMware. The Java version is 1.7 and the Spark version is 2.0.0. The code in the JavaWordCount.java class of the project is as follows:
package com.vishal.wc;
import scala.Tuple2;
import org.apache.hadoop.hive.ql.exec.spark.session.SparkSession;
import org.apache.spark.api.java.JavaRDD;
public class JavaWordCount {
public static final Pattern SPACE = Pattern.compile(" ");
public static void main(String[] args) throws Exception {
if(args.length < 2){
System.err.println("Usage: JavaWordCount <InputFile> <OutputFile>");
System.exit(1);
}
SparkSession spark = SparkSession.builder().appName("JavaWordCount").getOrCreate();
JavaRDD<String> lines = spark.read().textFile(args[0]).javaRDD();
JavaRDD<String> words = lines.flatMap(new FlatMapFunction<String, String>(){
public Iterator<String> call(String s){
return Arrays.asList(s.split(" ")).iterator();
}
});
JavaPairRDD<String, Integer> ones = words.mapToPair(new PairFunction<String, String, Integer>(){
public tuple2<String, Integer> call(String s){
return new tuple2<>(s,1);
}
});
JavaPairRDD<String, Integer> counts = ones.reduceByKey(
new Function2<Integer, Integer, Integer>(){
public Integer call(Integer i1, Integer i2){
return i1 + i2;
}
});
counts.saveAsTextFile(args[1]);
spark.stop();
}
}
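To make the intended result clear, here is a minimal plain-Java sketch of what I expect the job to compute, without any Spark dependency. The class name LocalWordCount and the method countWords are hypothetical and exist only for this illustration; they are not part of the project. The sketch splits each line on a single space and sums a count of 1 per word, which is what the flatMap, mapToPair and reduceByKey steps are meant to do.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the Spark project: it mirrors the
// intended flatMap -> mapToPair -> reduceByKey logic in plain Java.
class LocalWordCount {

    // Counts how often each space-separated token occurs across all lines.
    static Map<String, Integer> countWords(List<String> lines) {
        // TreeMap keeps the words sorted so the printed result is stable.
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String line : lines) {
            // Same split as in the Spark code: a single space.
            for (String word : line.split(" ")) {
                Integer current = counts.get(word);
                // Plays the role of reduceByKey summing the 1s per word.
                counts.put(word, current == null ? 1 : current + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("to be or", "not to be");
        // prints {be=2, not=1, or=1, to=2}
        System.out.println(countWords(lines));
    }
}
```

The Spark version should write the same word/count pairs to the output path, just distributed across part files.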
At first there were errors because no Spark jars had been added to the project. I then added the jars from spark-2.0.0-bin-hadoop2.7.tgz to the build path, but the errors are almost the same. They are listed below:
Description Resource Path Location Type
FlatMapFunction cannot be resolved to a type JavaWordCount.java /SparkProject/src/com/vishal/wc line 26 Java Problem
Function2 cannot be resolved to a type JavaWordCount.java /SparkProject/src/com/vishal/wc line 44 Java Problem
Iterator cannot be resolved to a type JavaWordCount.java /SparkProject/src/com/vishal/wc line 28 Java Problem
JavaPairRDD cannot be resolved to a type JavaWordCount.java /SparkProject/src/com/vishal/wc line 32 Java Problem
JavaPairRDD cannot be resolved to a type JavaWordCount.java /SparkProject/src/com/vishal/wc line 42 Java Problem
PairFunction cannot be resolved to a type JavaWordCount.java /SparkProject/src/com/vishal/wc line 32 Java Problem
The method builder() is undefined for the type SparkSession JavaWordCount.java /SparkProject/src/com/vishal/wc line 22 Java Problem
The method flatMap(FlatMapFunction<String,U>) in the type AbstractJavaRDDLike<String,JavaRDD<String>> is not applicable for the arguments (new FlatMapFunction<String,String>(){}) JavaWordCount.java /SparkProject/src/com/vishal/wc line 26 Java Problem
The method mapToPair(PairFunction<String,K2,V2>) in the type AbstractJavaRDDLike<String,JavaRDD<String>> is not applicable for the arguments (new PairFunction<String,String,Integer>(){}) JavaWordCount.java /SparkProject/src/com/vishal/wc line 32 Java Problem
The method read() is undefined for the type SparkSession JavaWordCount.java /SparkProject/src/com/vishal/wc line 24 Java Problem
The method stop() is undefined for the type SparkSession JavaWordCount.java /SparkProject/src/com/vishal/wc line 52 Java Problem
tuple2 cannot be resolved to a type JavaWordCount.java /SparkProject/src/com/vishal/wc line 35 Java Problem
tuple2 cannot be resolved to a type JavaWordCount.java /SparkProject/src/com/vishal/wc line 37 Java Problem
Please help me resolve these errors.