0
votes

According to the documentation for the WebHDFS REST API:

https://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#Delegation_Token_Operations

it says that when security is on, there are two authentication mechanisms:

Authentication using Kerberos SPNEGO when security is on

Authentication using Hadoop delegation token when security is on

If I choose the second option, i.e. authentication using a Hadoop delegation token when security is on:

Does that mean it can run without any Kerberos configuration in the Hadoop setup?

Do I have to set up Kerberos in my Hadoop configuration in this case?


1 Answer

3
votes

To put things in context: typically, you use SPNEGO when you start your HTTP session, then cache your credentials somehow to avoid the complex rounds of 3-way communication between client, server, and Kerberos KDC.

AFAIK, all the Hadoop UIs and REST APIs use a signed cookie after the initial SPNEGO, and it's completely transparent for you -- with the exception of WebHDFS.

Now, with WebHDFS, you have to manage your "credentials cache" explicitly:

  1. start your session with a GET ?op=GETDELEGATIONTOKEN -- you don't present any credentials, therefore it will trigger a SPNEGO authentication, then generate a Hadoop delegation token server-side
  2. retrieve that delegation token from the JSON result
  3. use that token to present your session credentials explicitly in the following GET / POST / PUT, by appending &delegation=XXXXXX to all URLs
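
As a rough illustration of the three steps above, here is a minimal Python sketch of the URL and JSON handling only. It assumes the `{"Token": {"urlString": ...}}` response shape shown in the WebHDFS documentation; the helper names, the host `namenode.example.com` and port 50070 are placeholders of mine, not anything from your cluster. The HTTP call itself is deliberately left out, because step 1 still needs a client that can do SPNEGO (for example `curl --negotiate -u :` after a `kinit`).

```python
import json
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

WEBHDFS_PREFIX = "/webhdfs/v1"


def token_request_url(base, renewer=None):
    """Step 1: build the GETDELEGATIONTOKEN URL (hypothetical helper).

    The request to this URL is the one that must be authenticated with SPNEGO.
    """
    params = {"op": "GETDELEGATIONTOKEN"}
    if renewer:
        params["renewer"] = renewer
    return f"{base.rstrip('/')}{WEBHDFS_PREFIX}/?{urlencode(params)}"


def extract_token(body):
    """Step 2: pull the token string out of the JSON response body."""
    return json.loads(body)["Token"]["urlString"]


def with_delegation(url, token):
    """Step 3: append delegation=<token> to an existing WebHDFS URL."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("delegation", token))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    base = "http://namenode.example.com:50070"  # placeholder host and port
    print(token_request_url(base, renewer="alice"))
    token = extract_token('{"Token": {"urlString": "abc123"}}')  # sample body
    print(with_delegation(f"{base}{WEBHDFS_PREFIX}/tmp/f?op=OPEN", token))
```

Once you hold the token, the follow-up requests built by `with_delegation` need no Kerberos round trip, which is the whole point of the token.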

Bottom line: yes, you still have to set up your Kerberos configuration on the client side, because the delegation token can only be obtained after a SPNEGO authentication. The token only lets you minimize the authentication overhead on subsequent requests.