I'm following the getting-started guides on the Apache sites for Hadoop and Hive. I have Hadoop configured to run in pseudo-distributed mode. I can run HDFS operations, start Beeline, create tables, insert data, and so on. The only problem is that I expect the databases to be stored at /user/hive/warehouse on HDFS, but they are instead created at the same path on the local file system.
Here are my versions and configs:
hadoop@precise64:/data/hadoop-2.8.2/logs$ hadoop version
Hadoop 2.8.2
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 66c47f2a01ad9637879e95f80c41f798373828fb
Compiled by jdu on 2017-10-19T20:39Z
Compiled with protoc 2.5.0
From source with checksum dce55e5afe30c210816b39b631a53b1d
This command was run using /data/hadoop-2.8.2/share/hadoop/common/hadoop-common-2.8.2.jar
hadoop@precise64:/data/hadoop-2.8.2/logs$ hive --version
Hive 2.3.2
Git git://stakiar-MBP.local/Users/stakiar/Desktop/scratch-space/apache-hive -r 857a9fd8ad725a53bd95c1b2d6612f9b1155f44d
Compiled by stakiar on Thu Nov 9 09:11:39 PST 2017
From source with checksum dc38920061a4eb32c4d15ebd5429ac8a
hadoop@precise64:/data/hadoop-2.8.2/logs$ cat $HADOOP_HOME/etc/hadoop/yarn-site.xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
hadoop@precise64:/data/hadoop-2.8.2/logs$ cat $HADOOP_HOME/etc/hadoop/core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hive.groups</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hive.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <value>*</value>
  </property>
</configuration>
hadoop@precise64:/data/hadoop-2.8.2/logs$ cat $HADOOP_HOME/etc/hadoop/hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>file:///home/hadoop/hadoopinfra/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>file:///home/hadoop/hadoopinfra/hdfs/datanode</value>
  </property>
</configuration>
hadoop@precise64:/data/apache-hive-2.3.2-bin/conf$ cat hive-site.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=metastore_db;create=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.EmbeddedDriver</value>
  </property>
  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/home/hadoop/tmp</value>
  </property>
  <property>
    <name>hive.downloaded.resources.dir</name>
    <value>/home/hadoop/tmp/${hive.session.id}_resources</value>
  </property>
  <property>
    <name>hive.querylog.location</name>
    <value>/home/hadoop/tmp</value>
  </property>
  <property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/home/hadoop/tmp/operation_logs</value>
  </property>
</configuration>
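For reference, the warehouse location is controlled by the hive.metastore.warehouse.dir property, which defaults to /user/hive/warehouse and does not appear in the hive-site.xml excerpt above. A minimal sketch of what an explicit, fully qualified setting could look like follows. The hdfs://localhost:9000 prefix is an assumption taken from the fs.default.name value in core-site.xml, and the fragment is shown only to make the expected location concrete, not as a confirmed fix.

```xml
<!-- Hypothetical addition to hive-site.xml; not part of the original config. -->
<!-- The URI prefix is assumed from fs.default.name in core-site.xml. -->
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>hdfs://localhost:9000/user/hive/warehouse</value>
</property>
```

When the value is an unqualified path such as /user/hive/warehouse, it is resolved against whatever default filesystem Hive picks up from the Hadoop configuration, so a fully qualified URI removes that ambiguity.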
The hive-site was based on the template, which is large. I've uploaded it to: dropbox.com/s/ake7my6wtjemiqu/hive-site.xml?dl=0 – Paul Jackson