1 vote

I am trying to load a CSV file from HDFS using PigStorage, limit the output to one record, and dump it.

My setup: I am running a 2-node cluster with one master (NameNode & Secondary NameNode) and one slave node running the DataNode and JobTracker.

My Pig script runs on the data node, as the root user.

    grunt> x = load '/user/hadoop/input/myfile.csv' using PigStorage(',') as (colA:chararray);
    grunt> y = limit x 1;
    grunt> dump y;

Console log:

    HadoopVersion   PigVersion   UserId   StartedAt             FinishedAt            Features
    1.0.4           0.11.1       root     2013-09-26 17:35:18   2013-09-26 17:35:47   LIMIT

    Failed!

    Failed Jobs:
    JobId                   Alias   Feature   Message   Outputs
    job_201309190323_0019   x,y               Message: Job failed! Error - JobCleanup Task Failure, Task: task_201309190323_0019_m_000002

I am getting a permission-denied error, and the log shows:

    org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=hadoop, access=EXECUTE, inode="hadoop-root":root:supergroup:rwx------

which says that permission is denied when user "hadoop" tries to execute on the folder "hadoop-root".

But my current user is root (from which I am running Pig), and my NameNode runs as user hadoop (the superuser, I hope).

**Why is the log showing user=hadoop instead of root? Am I doing anything wrong?**

Snapshot of hdfs:

    [hadoop@hadoop-master ~]$ hadoop fs -ls /
    Warning: $HADOOP_HOME is deprecated.

    Found 2 items
    drwx------   - hadoop supergroup          0 2013-09-26 17:29 /tmp
    drwxr-xr-x   - hadoop supergroup          0 2013-09-26 14:20 /user
----------------------------------------------------------------------------------------
    [root@hadoop-master hadoop]# hadoop fs -ls /user
    Warning: $HADOOP_HOME is deprecated.

    Found 2 items
    drwxr-xr-x   - hadoop supergroup          0 2013-09-26 14:19 /user/hadoop
    drwxr-xr-x   - root   root                0 2013-09-26 14:33 /user/root
----------------------------------------------------------------------------------------    
    [hadoop@hadoop-master ~]$ hadoop fs -ls /tmp
    Warning: $HADOOP_HOME is deprecated.

    Found 15 items
    drwx------   - hadoop supergroup          0 2013-09-19 01:43 /tmp/hadoop-hadoop
    drwx------   - root   supergroup          0 2013-09-19 03:25 /tmp/hadoop-root
    drwxr-xr-x   - hadoop supergroup          0 2013-09-26 17:29 /tmp/temp-1036150440
    drwxr-xr-x   - root   supergroup          0 2013-09-26 17:27 /tmp/temp-1270545146
    drwx------   - root   supergroup          0 2013-09-26 14:51 /tmp/temp-1286962351
    drwx------   - hadoop supergroup          0 2013-09-26 14:12 /tmp/temp-1477800537
    drwx------   - hadoop supergroup          0 2013-09-26 15:25 /tmp/temp-1503376062
    drwx------   - root   supergroup          0 2013-09-26 14:09 /tmp/temp-282162612
    drwx------   - root   supergroup          0 2013-09-26 17:22 /tmp/temp-758240893
    drwx------   - root   supergroup          0 2013-09-26 15:00 /tmp/temp1153649785
    drwx------   - root   supergroup          0 2013-09-26 13:35 /tmp/temp1294190837
    drwx------   - root   supergroup          0 2013-09-26 13:42 /tmp/temp1469783962
    drwx------   - root   supergroup          0 2013-09-26 14:45 /tmp/temp2087720556
    drwx------   - hadoop supergroup          0 2013-09-26 14:29 /tmp/temp2116374858
    drwx------   - root   supergroup          0 2013-09-26 16:55 /tmp/temp299188455

I even tried turning off the permission check (dfs.permissions, which I set in core-site.xml on both of my nodes), as suggested in "Permission denied at hdfs", and restarted all my Hadoop services. But still no luck.
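For reference, this is roughly the property I added (a sketch; in Hadoop 1.x this setting is usually documented under hdfs-site.xml):

    <!-- disables HDFS permission checking entirely; requires a NameNode restart -->
    <property>
      <name>dfs.permissions</name>
      <value>false</value>
    </property>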

As per the log, I tried

    hadoop fs -chmod -R 777 /tmp

since I identified that hadoop-root (the directory lacking permissions according to the log above) lives under the /tmp directory in HDFS.

But I got a different exception after changing the permissions:

    Message: java.io.IOException: The ownership/permissions on the staging directory hdfs://hadoop-master:9000/tmp/hadoop-root/mapred/staging/root/.staging is not as expected. It is owned by root and permissions are rwxrwxrwx. The directory must be owned by the submitter root or by root and permissions must be rwx------

So I reverted the permissions with hadoop fs -chmod -R 700 /tmp, and the same old permission-denied exception came back.

Could you please help?

Comments:

- Any help for the above problem? (devThoughts)
- Try copying the input file to HDFS and then access it from HDFS in your Pig code. Let us know how it goes. (Ambarish Hazarnis)
- Well, I have the input file in HDFS itself and I am loading that same file using Pig. (devThoughts)

2 Answers

2 votes

Finally, I was able to solve this problem.

My /tmp directory in HDFS did not have the proper permissions. I tried changing them to 1777 (the sticky bit) while I already had files in HDFS, but that did not work.

Through trial and error, I took a backup of my HDFS data to the local file system using -copyToLocal and removed all my files, including the /tmp folder.

Then I recreated the /tmp directory, this time with the proper permissions:

    hadoop fs -chmod 1777 /tmp

and copied all my files back into HDFS using the -put command.
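The full sequence looked roughly like this (a sketch with illustrative paths; Hadoop 1.x shell syntax, run as the HDFS superuser):

    # back up HDFS data to the local file system (paths are illustrative)
    hadoop fs -copyToLocal /user/hadoop /backup/hadoop
    # remove the old directories, including /tmp
    hadoop fs -rmr /tmp
    # recreate /tmp with the sticky bit, so every user can write to it
    # but cannot delete other users' files
    hadoop fs -mkdir /tmp
    hadoop fs -chmod 1777 /tmp
    # restore the data
    hadoop fs -put /backup/hadoop/* /user/hadoop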

This time, the Pig script from my original post worked like a charm.

I checked the permissions on /tmp/hadoop-root/mapred/staging, and they are set to what they should be:

    drwxrwxrwx
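For anyone who wants to verify the same thing, the check is just a directory listing (the path comes from the error message above):

    hadoop fs -ls /tmp/hadoop-root/mapred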

Hope this helps anyone who is facing the same issue.

Cheers

0 votes
    sudo su - hdfs

Once you're running as the hdfs user, you should be able to run

    hadoop fs -chmod -R 777 /tmp

All file permissions should then be changed.