I'm trying to use Structured Streaming in Spark against a local Kafka topic.
First I start ZooKeeper and Kafka from PowerShell:
Write-Host -ForegroundColor Green "starting zookeeper..."
Start-Process "$KAFKA_ROOT\bin\windows\zookeeper-server-start.bat" "$KAFKA_ROOT\config\zookeeper.properties"
Write-Host -ForegroundColor Green "starting kafka..."
Start-Process "$KAFKA_ROOT\bin\windows\kafka-server-start.bat" "$KAFKA_ROOT\config\server.properties"
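For anyone reproducing this, the "test" topic subscribed to below needs to exist. A sketch of creating it programmatically with Kafka's AdminClient, as an alternative to kafka-topics.bat — the bootstrap address is assumed to match the default server.properties:

```scala
// Sketch: create the "test" topic programmatically.
// Assumes the broker started above is listening on localhost:9092 (the default).
import java.util.{Collections, Properties}
import org.apache.kafka.clients.admin.{AdminClient, NewTopic}

val props = new Properties()
props.put("bootstrap.servers", "localhost:9092")
val admin = AdminClient.create(props)
// 1 partition, replication factor 1 — fine for a single local broker
admin.createTopics(Collections.singletonList(new NewTopic("test", 1, 1.toShort))).all().get()
admin.close()
```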
Then I start the shell like so:
& "$SPARK_ROOT\bin\spark-shell.cmd" --packages "org.apache.spark:spark-sql-kafka-0-10_2.11:2.3.1"
Then I run this Scala command:
val ds = spark.readStream.format("kafka").option("kafka.bootstrap.servers", "localhost:9092").option("subscribe", "test").load()
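For context, the eventual goal is just to echo the topic to the console. A minimal sketch of the rest of the query, assuming `ds` ever loads successfully:

```scala
// Hypothetical continuation once load() succeeds: dump key/value pairs to the console.
val query = ds.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
  .writeStream
  .format("console")
  .start()
query.awaitTermination()
```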
This should just work, but instead I get this error:
org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rw-rw-rw-;
Every search result says something about using winutils to set permissions on the scratch directory, so I tried those answers; this is the output:
C:\>winutils chmod 777 \tmp\hive
C:\>winutils chmod 777 C:\tmp\hive
C:\>winutils ls C:\tmp\hive
drwxrwxrwx 1 DOMAIN\user DOMAIN\Domain Users 0 Jun 21 2018 C:\tmp\hive
That looks correct, but the same exception still occurs.
%HADOOP_HOME% is correctly set to D:\dependencies\hadoop, and D:\dependencies\hadoop\bin\winutils.exe exists.
What am I missing? I've gone through over a dozen posts here and there, but their fixes don't work for me, and I don't know how to debug this further.
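In case it helps: one way to see the permission that Hive's check actually observes (rather than what winutils prints) would be to query the Hadoop FileSystem API from inside spark-shell. A sketch, assuming the default filesystem resolves to the local disk — note that on Windows a drive-less path like /tmp/hive may resolve against the drive spark-shell was launched from:

```scala
// Run inside spark-shell: inspect the permission Hadoop reports for /tmp/hive.
// The AnalysisException above is raised from this value, not from what
// Explorer or winutils ls display.
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

val fs = FileSystem.get(new Configuration())   // default (local) filesystem
val perm = fs.getFileStatus(new Path("/tmp/hive")).getPermission
println(perm)                                  // compare against rwxrwxrwx
```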