As a Hadoop/Spark beginner, I followed the tutorial on this website and successfully deployed a Hadoop framework on my single machine (CentOS 6). Now I want to install Spark 1.2 on the same machine and have it work with the single-node YARN cluster that is already there, i.e. run Spark SQL on files stored on HDFS and write the results back to HDFS. I haven't found a good online tutorial that covers the remaining steps for this scenario.
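To make the goal concrete, what I eventually want to submit to YARN is a small Spark SQL job roughly like the sketch below (the file paths, the Person case class and the query are just made-up placeholders; I'm writing against the Spark 1.2 SQLContext/SchemaRDD API as I understand it, so please point out anything wrong):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// made-up record layout for a comma-separated file I keep on HDFS
case class Person(name: String, age: Int)

object SparkSqlOnYarn {
  def main(args: Array[String]) {
    val sc = new SparkContext(new SparkConf().setAppName("spark-sql-on-yarn"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.createSchemaRDD // lets an RDD of case classes be used as a table (Spark 1.2 style)

    // read the input file from HDFS and turn each line into a Person
    val people = sc.textFile("hdfs:///user/me/people.csv")
      .map(_.split(","))
      .map(f => Person(f(0), f(1).trim.toInt))
    people.registerTempTable("people")

    // run a SQL query and write the result back to HDFS
    val adults = sqlContext.sql("SELECT name, age FROM people WHERE age >= 18")
    adults.map(_.mkString(",")).saveAsTextFile("hdfs:///user/me/adults")

    sc.stop()
  }
}
```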
What I have done so far:
(1) Downloaded Scala 2.9.3 from the official Scala website and installed it. The "scala -version" command works!
(2) Downloaded Spark 1.2.1 (pre-built for Hadoop 2.4 and later) from the Apache Spark website and untarred it.
What should I do next? Which config files in the Spark directory need to be changed, and how? Can someone give a step-by-step tutorial, especially for configuring spark-env.sh? The more detailed the better. Thanks! (If you have questions about how I configured my Hadoop and YARN, I followed exactly the steps listed on the website I mentioned above.)
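For reference, my current guess at spark-env.sh, pieced together from the comments in conf/spark-env.sh.template and the "Running on YARN" page, is the following (the paths are placeholders for my setup and I haven't verified any of this):

```
# my guess -- paths are placeholders, not verified
export JAVA_HOME=/usr/java/default
export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop   # directory containing core-site.xml, hdfs-site.xml, yarn-site.xml
export YARN_CONF_DIR=$HADOOP_CONF_DIR
export SPARK_EXECUTOR_MEMORY=1g
export SPARK_DRIVER_MEMORY=1g
```

After that I assume I would launch things with something like "./bin/spark-shell --master yarn-client" for interactive use, or "./bin/spark-submit --master yarn-cluster" for a packaged jar, but I'm not sure whether that is all that is needed.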