I am new to using Spark with Hadoop.
Current Scenario:
I have already configured Spark on a 4-node cluster using the pre-built binary "spark-1.5.2-bin-hadoop2.6".
There is also a separate 4-node Hadoop 2.4 cluster in my environment.
What I want:
I am planning to do Spark RDD processing, driven by Hive HQL queries, on the data stored in HDFS on the Hadoop cluster. A minimal sketch of the kind of job I have in mind is shown below.
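For context, here is a rough sketch of what I intend to run, using the Spark 1.5.x HiveContext API. The table name, filter, and HDFS output path are placeholders for my actual data, not real values from my setup:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    object HiveHqlOnHdfsSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("HiveHqlOnHdfs")
        val sc = new SparkContext(conf)
        // HiveContext lets Spark 1.5.x run HQL against Hive tables backed by HDFS
        val hiveContext = new HiveContext(sc)

        // "my_table" and the filter are placeholders for my real Hive table/query
        val df = hiveContext.sql("SELECT * FROM my_table WHERE some_col > 10")

        // Continue processing as an RDD and write the result back to HDFS
        // (namenode host/port below is a placeholder)
        df.rdd
          .map(row => row.mkString(","))
          .saveAsTextFile("hdfs://<namenode-host>:8020/user/output/result")

        sc.stop()
      }
    }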
Queries:
Do I need to reconfigure the Spark cluster using the "spark-1.5.2-bin-hadoop2.4" binary, or will the current one work?
Is it good practice to run Spark over Hadoop with Spark and Hadoop on two different clusters (but within the same subnet in the cloud)?