I have a Hadoop cluster with 4 nodes, and I created some Hive tables from files stored in HDFS. Then I configured MySQL as the Hive metastore and copied the hive-site.xml file into Spark's conf folder.
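For context, the metastore part of my hive-site.xml looks roughly like this (a sketch; the host, database name, user and password below are placeholders, not my real values):

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://mysql-host:3306/hive_metastore</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive_user</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive_password</value>
</property>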
To start the Hadoop cluster I ran start-dfs.sh and start-yarn.sh. Then I created the Hive tables, and now I'm executing some queries against them from Spark SQL using a HiveContext, like:
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
val query = hiveContext.sql("select * from customers")
query.show
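As a sanity check (assuming the metastore connection in hive-site.xml is correct), listing what the HiveContext sees should include the tables I created, e.g. customers:

hiveContext.sql("show databases").show   // databases registered in the MySQL metastore
hiveContext.sql("show tables").show      // should list customers, etc.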
My question is: in this case, which cluster manager is Spark using? Is it YARN, just because I started YARN with the ./start-yarn.sh command? Or do I need to configure something for Spark to use YARN, and if I don't, does it fall back to another cluster manager by default?
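For example, I assume I can check which master the running SparkContext was started with (sc.master should reflect whatever was passed via --master or set as spark.master in conf/spark-defaults.conf):

// Prints the master URL the current SparkContext is using.
// If spark-shell was launched without --master, I expect something like "local[*]";
// with --master yarn-client (or yarn, depending on the Spark version) it should report that instead.
println(sc.master)

So my understanding is that, if I want Spark to actually run on YARN, I should start the shell with ./bin/spark-shell --master yarn-client (or set spark.master accordingly). Is that right?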
And in your opinion, which cluster manager is better for this case? Or does it not matter?