1
votes

I am loading data from one Hive table to another using Spark SQL. I've created a SparkSession with enableHiveSupport and I can create tables in Hive through Spark SQL, but when I load data from one Hive table into another I get a permission error:

Permission denied: user=anonymous,access=WRITE, path="hivepath".

I am running this as the spark user, and I don't understand why it uses anonymous as the user instead of spark. Can anyone suggest how I should resolve this issue?

I'm using the code below.

    sparksession.sql("insert overwrite into table dbname.tablename" select * from dbname.tablename").
3
why do you have three " in your query? - Lamanus

3 Answers

0
votes

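If you're using Spark, you need to set the Hadoop user name before you create the Spark session. Note that HADOOP_USER_NAME is only honoured with simple (non-Kerberos) authentication.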

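    import org.apache.spark.sql.SparkSession

    // Set the Hadoop user before the session is created
    System.setProperty("HADOOP_USER_NAME", "newUserName")
    val spark = SparkSession
      .builder()
      .appName("SparkSessionApp")
      .master("local[*]")
      .getOrCreate()

    // Print the user Spark is actually running as
    println(spark.sparkContext.sparkUser)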
0
votes

First, you can try creating an HDFS home directory for the anonymous user:

root@host:~# su - hdfs
hdfs@host:~$ hadoop fs -mkdir /user/anonymous
hdfs@host:~$ hadoop fs -chown anonymous /user/anonymous

In general, running export HADOOP_USER_NAME=youruser before spark-submit will work, together with a spark-submit option like the one below:

--conf "spark.yarn.appMasterEnv.HADOOP_USER_NAME=${HADOOP_USER_NAME}" \

Alternatively, you can try sudo -su username spark-submit --class your class

see this

Note: ideally, this user name should be set as part of your initial cluster setup. If it is, none of the above is needed and everything works seamlessly.

Personally, I prefer not to hard-code the user name in the code; it should come from outside the Spark job.
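
As a rough sketch of keeping the user name outside the code, the job could read it from its first command-line argument or from the HADOOP_USER_NAME environment variable and only then build the session. The object name LoadHiveTable, the helper resolveUser and the table names target_table / source_table are made up for illustration. It assumes Spark with Hive support is on the classpath and that the cluster uses simple (non-Kerberos) authentication.

```scala
import org.apache.spark.sql.SparkSession

object LoadHiveTable {
  // Hypothetical helper: the first CLI argument wins, then the
  // HADOOP_USER_NAME environment variable; None means "change nothing".
  def resolveUser(args: Array[String], env: Map[String, String]): Option[String] =
    args.headOption.orElse(env.get("HADOOP_USER_NAME")).filter(_.nonEmpty)

  def main(args: Array[String]): Unit = {
    // Has to happen before the SparkSession is created, because Hadoop
    // picks up the user when it first initialises its login.
    resolveUser(args, sys.env)
      .foreach(user => System.setProperty("HADOOP_USER_NAME", user))

    val spark = SparkSession.builder()
      .appName("LoadHiveTable")
      .enableHiveSupport()
      .getOrCreate()

    // Confirm which user Spark is running as
    println(s"Running as: ${spark.sparkContext.sparkUser}")

    // Placeholder table names; source and target must be different tables
    spark.sql(
      "INSERT OVERWRITE TABLE dbname.target_table SELECT * FROM dbname.source_table")

    spark.stop()
  }
}
```

With this, the user can be passed as the first application argument to spark-submit, or exported in the shell beforehand, without touching the code.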

0
votes

To check which user you are running as, run the command below:

    sc.sparkUser

It will show you the current user. You can then try setting a new user; in Scala, you can set the user name with:

    System.setProperty("HADOOP_USER_NAME","newUserName")