
I'm trying to use Structured Streaming in Spark against a local Kafka topic.

First I start zookeeper and kafka:

write-host -foregroundcolor green "starting zookeeper..."
start "$KAFKA_ROOT\bin\windows\zookeeper-server-start.bat" "$KAFKA_ROOT\config\zookeeper.properties"

write-host -foregroundcolor green "starting kafka..."
start "$KAFKA_ROOT\bin\windows\kafka-server-start.bat" "$KAFKA_ROOT\config\server.properties"

Then I start the shell like so:

& "$SPARK_ROOT\bin\spark-shell.cmd" --packages "org.apache.spark:spark-sql-kafka-0-10_2.11:2.3.1"

Then I execute this Scala command:

val ds = spark.readStream.format("kafka").option("kafka.bootstrap.servers", "localhost:9092").option("subscribe", "test").load()

This should just work; however, I get this error:

org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rw-rw-rw-;
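For what it's worth, the permission string in the error maps to an octal mode like this (a plain Scala sketch of the rwx-to-octal mapping, nothing Spark-specific):

```scala
// Sketch: map an rwx permission string (user/group/other triads) to the
// octal mode that chmod-style tools take. "rw-rw-rw-" is 666, i.e. the
// execute bits are missing; "rwxrwxrwx" is 777.
def toOctal(perm: String): String =
  perm.grouped(3).map { triad =>
    triad.zip(Seq(4, 2, 1)).collect { case (c, v) if c != '-' => v }.sum
  }.mkString

println(toOctal("rw-rw-rw-"))  // 666 -- the mode from the error message
println(toOctal("rwxrwxrwx"))  // 777 -- the mode after chmod 777
```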

Every search result says something about using winutils to set permissions, so I tried those answers; this is the output:

C:\>winutils chmod 777 \tmp\hive

C:\>winutils chmod 777 C:\tmp\hive

C:\>winutils ls C:\tmp\hive
drwxrwxrwx 1 DOMAIN\user DOMAIN\Domain Users 0 Jun 21 2018 C:\tmp\hive

Looks good, but the same exception still occurs.

%HADOOP_HOME% is correctly set to D:\dependencies\hadoop and D:\dependencies\hadoop\bin\winutils.exe exists.

What am I missing here? I've gone through more than a dozen posts, but their solutions aren't working for me, and I don't know how to debug it further.


2 Answers


So after pulling my hair out for two days, of course it was something simple. If you are calling C:\spark\bin\spark-shell from a working directory on another drive (e.g. D:), then the permissions you actually need to update are on that drive's \tmp\hive:

C:\Users\user>winutils ls D:\tmp\hive
d--------- 1 DOMAIN\user DOMAIN\Domain Users 0 Jun 25 2018 D:\tmp\hive

C:\Users\user>winutils chmod -R 777 D:\tmp\hive

C:\Users\user>winutils ls D:\tmp\hive
drwxrwxrwx 1 DOMAIN\user DOMAIN\Domain Users 0 Jun 25 2018 D:\tmp\hive

I could find no command, no config setting, and no page in the web UI's environment view that would show what the current Hive scratch directory is.
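The effect can be sketched with plain string handling (a hypothetical helper, not a Spark or Hadoop API): a scratch path with no drive letter, like /tmp/hive, is resolved against whatever drive the shell's working directory is on.

```scala
// Hypothetical illustration of why the scratch dir lands on the
// spark-shell's working drive: prepend the working directory's drive
// letter to the drive-less /tmp/hive path.
def resolveScratchDir(workingDir: String, scratchDir: String = "/tmp/hive"): String =
  workingDir.take(2) + scratchDir.replace('/', '\\')

println(resolveScratchDir("D:\\projects\\demo"))  // D:\tmp\hive
println(resolveScratchDir("C:\\Users\\user"))     // C:\tmp\hive
```

So if spark-shell was launched from D:, it is D:\tmp\hive that needs chmod 777, regardless of where Spark itself is installed.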


You need to set the expected access mode on the HDFS directory, not on the directory on the local file system.

You would need to use the hadoop fs -chmod ... command for that. Also, do not forget to check that the user under which your Spark application is launched has the ability to write to /tmp/hive, either explicitly or by being in a group that is allowed to write to this directory.
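As a quick local sanity check (a plain JVM call against the local file system, not an HDFS permission check), you can ask whether the current user can write to a given directory before digging into Spark itself:

```scala
import java.nio.file.{Files, Path, Paths}

// Report whether a path is a directory the current JVM user can write to.
def isWritableDir(p: Path): Boolean =
  Files.isDirectory(p) && Files.isWritable(p)

// A fresh temp dir we just created is writable by us.
val tmp = Files.createTempDirectory("hive-scratch-check")
println(isWritableDir(tmp))                      // true
// The actual scratch dir on your machine (may not exist yet):
println(isWritableDir(Paths.get("/tmp/hive")))
```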

You may refer to the official documentation on HDFS file permissions.

Update:

So, if you bumped into the same issue, you need to use winutils as mentioned in the original post or in other similar questions, but the directory in question may not be located on drive C:, and you need to adjust the path to the temporary directory with the correct drive letter.