
I have integrated Milton WebDAV with Hadoop HDFS and am able to read and write files to the HDFS cluster.

I have also added authorization using Linux file permissions, so only authorized users can access the HDFS server. However, I am stuck at the authentication part.

It seems Hadoop does not provide any built-in authentication; users are identified only through the Unix `whoami` command, meaning I cannot set a password for a specific user. Ref: http://hadoop.apache.org/common/docs/r1.0.3/hdfs_permissions_guide.html So even if I create a new user and set permissions for it, there is no way to verify whether the user is authentic. Two users with the same username but different passwords have access to all the resources intended for that username.
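To illustrate the problem: under Hadoop's "simple" authentication mode, the server believes whatever username the client process reports. A rough sketch of that client-side identity resolution in plain Java is below. `resolveUser` is a hypothetical helper for illustration, not Hadoop's actual code, though Hadoop's `UserGroupInformation` consults the same sources (the `HADOOP_USER_NAME` environment variable and the OS account name).

```java
// Sketch of how "simple" authentication derives an identity: the
// username comes entirely from the client side, so anyone who can set
// an environment variable or create a matching OS account can claim it.
public class SimpleAuthSketch {
    static String resolveUser() {
        // HADOOP_USER_NAME, if set, overrides the OS user in simple mode
        String override = System.getenv("HADOOP_USER_NAME");
        if (override != null && !override.isEmpty()) {
            return override;
        }
        // Otherwise fall back to the local OS account (what `whoami` prints)
        return System.getProperty("user.name");
    }

    public static void main(String[] args) {
        System.out.println("HDFS would see you as: " + resolveUser());
    }
}
```

Nothing here involves a password or a secret: the identity is a self-asserted string, which is exactly why two OS users with the same name are indistinguishable to HDFS.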

I am wondering if there is any way to enable user authentication in HDFS, either built into a newer Hadoop release or via a third-party tool like Kerberos.

Edit:

OK, I have checked, and it seems that Kerberos may be an option. I just want to know if there is any other alternative available for authentication.

Thanks, -chhavi


1 Answer


Right now Kerberos is the only supported "real" authentication protocol. The "simple" protocol completely trusts the client's `whoami` information.

To set up Kerberos authentication, I suggest this guide: https://ccp.cloudera.com/display/CDH4DOC/Configuring+Hadoop+Security+in+CDH4
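For reference, switching a cluster from "simple" to Kerberos authentication is driven by a couple of properties in core-site.xml; the sketch below shows just those two, while the guide above covers the full set (per-daemon principals, keytab paths, and so on):

```xml
<!-- core-site.xml: replace the default "simple" mode with Kerberos -->
<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>
```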

msktutil is a nice tool for creating Kerberos keytabs on Linux: https://fuhm.net/software/msktutil/

When creating service principals, make sure you have correct DNS settings: if you have a server named "host1.yourdomain.com" that resolves to IP 1.2.3.4, then that IP should in turn resolve back to host1.yourdomain.com.
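A quick way to sanity-check that forward and reverse DNS agree is sketched below using the JDK's `InetAddress`. The pure name comparison is split into its own method so it can be checked without touching a live resolver; the hostname in `main` is just the placeholder from the example above.

```java
import java.net.InetAddress;

public class DnsCheck {
    // Pure check: the forward name and the reverse-resolved name should
    // match, ignoring case and an optional trailing dot.
    static boolean namesMatch(String forward, String reverse) {
        String a = forward.toLowerCase().replaceAll("\\.$", "");
        String b = reverse.toLowerCase().replaceAll("\\.$", "");
        return a.equals(b);
    }

    // Resolves host -> IP, then IP -> host, and compares the two names.
    static boolean forwardReverseAgree(String hostname) throws Exception {
        InetAddress addr = InetAddress.getByName(hostname);   // forward lookup
        String reverse = addr.getCanonicalHostName();         // reverse lookup
        return namesMatch(hostname, reverse);
    }

    public static void main(String[] args) throws Exception {
        String host = args.length > 0 ? args[0] : "host1.yourdomain.com";
        System.out.println(host + " forward/reverse consistent: "
                + forwardReverseAgree(host));
    }
}
```

Kerberos builds service principal names from the canonical (reverse-resolved) hostname, so a mismatch here typically shows up later as "server not found in Kerberos database" errors.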

Also note that Kerberos Negotiate authentication headers may be larger than Jetty's built-in header size limit; in that case you need to modify org.apache.hadoop.http.HttpServer and add `ret.setHeaderBufferSize(16*1024);` in `createDefaultChannelConnector()`. I had to.
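The change amounts to one extra line inside `createDefaultChannelConnector()` (sketched here against the Hadoop 1.x `HttpServer`, which uses Jetty 6's `SelectChannelConnector`; surrounding method context is abbreviated):

```java
// In org.apache.hadoop.http.HttpServer (Hadoop 1.x),
// inside createDefaultChannelConnector():
SelectChannelConnector ret = new SelectChannelConnector();
// Kerberos Negotiate tokens can exceed Jetty's default header
// buffer size, so raise the limit:
ret.setHeaderBufferSize(16 * 1024);
```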