0 votes

I have two instances of Hadoop running. I have no control over the first instance; I set up the second instance myself.

I also installed Hive on this machine. When I try to create a table with Hive, it connects to the first Hadoop instance and tries to write the data there.

I want to change the default Hadoop port that Hive looks for, so that Hive uses the second Hadoop instance to write data. I have set HADOOP_HOME, PATH, etc. correctly in my .bashrc and sourced it, as shown below.
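For reference, the relevant part of my .bashrc looks roughly like this (the install path is just an example, not my actual layout):

    # Environment for the second Hadoop instance (path is illustrative)
    export HADOOP_HOME=/opt/hadoop-second
    export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
    export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin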

Is there a way to change the default Hadoop port that Hive looks for when writing data?

You installed both Hadoop instances on the same machine? – Czoo
Yes. Additional info: the two instances are working fine and I can integrate them with other tools, keep the data separate, etc. The only thing I haven't figured out is a way for Hive to connect to the Hadoop instance that I created. – kumaran
What do you mean by two instances of Hadoop? Hadoop is a cluster technology, so are you running two independent NameNodes, DataNodes, YARN, etc.? Have you researched ViewFS or federated Hadoop? – OneCricketeer

2 Answers

0 votes

The Hadoop instance information for Hive is specified in /etc/hive/conf/hive-site.xml.

Edit it manually if you deployed standalone Hadoop, or through Ambari/Cloudera Manager if you used HDP/CDH, then restart the Hive service so it reloads the configuration, and you should be good.
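For instance, on a standalone setup you could point Hive's warehouse at the second instance with a property like this in hive-site.xml. The host and port here are placeholders; use whatever address your second NameNode actually listens on:

    <!-- hive-site.xml: warehouse location on the second HDFS instance
         (localhost:9000 is a placeholder for the second NameNode's address) -->
    <property>
      <name>hive.metastore.warehouse.dir</name>
      <value>hdfs://localhost:9000/user/hive/warehouse</value>
    </property>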

0 votes

"Is there a way to change the default Hadoop port that Hive looks for when writing data?"

If I understand correctly, you want to edit the core-site.xml file under HADOOP_CONF_DIR and change the fs.defaultFS address so that it points at the second instance's HDFS NameNode.
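A minimal sketch of that change, assuming the second NameNode listens on port 9000 (adjust the host and port to match your instance):

    <!-- core-site.xml: point the default filesystem at the second NameNode -->
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://localhost:9000</value>
    </property>

Also make sure the HADOOP_CONF_DIR that Hive sees resolves to this configuration directory; otherwise Hive will keep picking up the first instance's core-site.xml and write there instead.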