Does Hadoop delegation token authentication for the WebHDFS REST API depend on Kerberos SPNEGO?

Question

According to the documentation for the WebHDFS REST API:

https://hadoop.apache.org/docs/r2.7.3/hadoop-project-dist/hadoop-hdfs/WebHDFS.html#Delegation_Token_Operations

It mentions that when security is on, there are two mechanisms:

Authentication using Kerberos SPNEGO when security is on

Authentication using Hadoop delegation token when security is on

If I choose the second option, i.e. authentication using a Hadoop delegation token when security is on:

Does it mean it can run without any Kerberos configuration in the Hadoop setup?

Do I have to set up Kerberos in my Hadoop configuration in this case?

Samson Scharfrichter · Accepted Answer · 2016-12-01T17:04:36

To put things in context: typically, you use SPNEGO when you start your HTTP session, then cache your credentials somehow to avoid the complex rounds of 3-way communication between client, server, and Kerberos KDC.

AFAIK, all the Hadoop UIs and REST APIs use a signed cookie after the initial SPNEGO, and it's completely transparent for you -- with the exception of WebHDFS.

Now, with WebHDFS, you have to manage your "credentials cache" explicitly:

start your session with a GET ?op=GETDELEGATIONTOKEN -- you don't present any credentials, therefore it will trigger a SPNEGO authentication, then generate a Hadoop delegation token server-side
retrieve that delegation token from the JSON result
use that token to present your session credentials explicitly in the following GET / POST / PUT, by appending &delegation=XXXXXX to all URLs
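
The steps above can be sketched in Python. This is only an illustration under stated assumptions, not a complete client: the host name, port, file path, and token value are placeholders, the helper function names are made up for this example, and the SPNEGO-authenticated HTTP call itself is left out because it needs a Kerberos-capable HTTP client (for instance curl --negotiate -u : with a valid ticket). The sketch only builds the URLs and pulls the token out of the JSON response, whose Token / urlString layout follows the WebHDFS documentation.

```python
import json
from urllib.parse import quote, urlencode


def token_request_url(base, renewer=None):
    """URL for step 1: ask the NameNode for a delegation token.

    The request carries no credentials, so a secured cluster answers
    with a SPNEGO challenge that the HTTP client must satisfy.
    """
    params = {"op": "GETDELEGATIONTOKEN"}
    if renewer:
        params["renewer"] = renewer
    return f"{base}/webhdfs/v1/?{urlencode(params)}"


def extract_token(response_body):
    """Step 2: read the token string from the JSON response body."""
    return json.loads(response_body)["Token"]["urlString"]


def with_delegation(base, path, op, token, **extra):
    """Step 3: build a later request URL that carries the token."""
    params = {"op": op, "delegation": token, **extra}
    return f"{base}/webhdfs/v1{quote(path)}?{urlencode(params)}"


# Placeholder values, assumed for illustration only.
BASE = "http://namenode.example.com:50070"
sample_body = '{"Token": {"urlString": "abc123"}}'

print(token_request_url(BASE, renewer="alice"))
token = extract_token(sample_body)
print(with_delegation(BASE, "/user/alice/data.txt", "OPEN", token))
```

Only the first URL needs a Kerberos ticket on the client; every URL built by with_delegation can then be fetched with a plain HTTP client until the token expires or is cancelled.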

Bottom line: yes, you have to set up your Kerberos configuration on the client side. The delegation token only lets you minimize the authentication overhead; you still need SPNEGO to obtain it in the first place.
