
I am having an issue and really need some help. We are trying to kerberize our Hadoop cluster, including HiveServer2 and Oozie. My Oozie job spins off a Java action on a data node, and that action tries to connect to the kerberized HiveServer2. No user Kerberos keytab is available there for authentication, so the only thing I can use to connect to HiveServer2 is the delegation token that Oozie passes to the Java action. My question is: is there any way to use a delegation token in an Oozie Java action to connect to HiveServer2? If so, how can I do it through Hive JDBC? Thanks, Jary


1 Answer


When using Oozie in a kerberized cluster...

  • for a "Hive" or "Pig" Action, you must configure <credentials> of type hcat
  • for a "Hive2" Action (just released with Oozie 4.2), you must configure <credentials> of type hive2
  • for a "Java" Action opening a custom JDBC connection to HiveServer2, I fear that Oozie cannot help, unless there is an undocumented hack that would make it possible to reuse this new hive2 credential?!?

Reference: Oozie documentation about Kerberos credentials

AFAIK you cannot use Hadoop delegation tokens with HiveServer2. HS2 uses Thrift to manage client connections, and Thrift supports Kerberos, but Hadoop delegation tokens are something different: Kerberos was never intended for distributed computing, so Hadoop needed a workaround.

What you can do is ship a full set of GSSAPI configuration, including a keytab, in your "Java" Action. It works, but there are a number of caveats:

  1. the Hadoop Auth library seems to be hard-wired to the local ticket cache in a very lame way; if you must connect to both HDFS and HiveServer2, then do HDFS first, because as soon as JDBC obtains its own ticket based on your custom conf, Hadoop Auth will be broken
  2. Kerberos configuration is tricky, GSSAPI config is worse, and since these are security features the error messages are unhelpful by design (it would be bad taste to tell hackers why their intrusion attempt was rejected)
  3. use OpenJDK if possible; by default the Sun/Oracle JVM limits cryptographic strength (because of silly and obsolete US export policies), so you must download 2 JARs with "unlimited strength" crypto settings to replace the default ones
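
Caveat 3 can be checked at runtime before you start chasing GSSAPI errors. The following is a minimal sketch, assuming your keytab uses AES-256 encryption types (an assumption, not something stated above); the class name `CryptoPolicyCheck` is made up for illustration. It relies only on the standard `javax.crypto.Cipher.getMaxAllowedKeyLength` call.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.Cipher;

// Hypothetical helper: reports whether the JVM's crypto policy allows AES-256.
final class CryptoPolicyCheck {

    /** Maximum AES key length (in bits) permitted by the installed policy files. */
    static int maxAesKeyBits() {
        try {
            return Cipher.getMaxAllowedKeyLength("AES");
        } catch (NoSuchAlgorithmException e) {
            // AES is mandatory on every compliant JVM, so this should not happen.
            throw new IllegalStateException("AES not available", e);
        }
    }

    /** True when keys of at least 256 bits are allowed (unlimited-strength policy). */
    static boolean allowsAes256() {
        return maxAesKeyBits() >= 256;
    }
}
```

Calling `CryptoPolicyCheck.allowsAes256()` at the start of the Java Action and logging the result tells you whether the "unlimited strength" policy is actually in effect on the node where the action runs, which may not be the node you tested on.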

Reference: another StackOverflow post that I found really helpful for setting up "raw" Kerberos authentication when connecting to HiveServer2; plus a link about a very helpful "trace flag" for debugging your GSSAPI config, e.g.

-Djava.security.debug=gssloginconfig,configfile,configparser,logincontext
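
For orientation, here is a rough sketch of what the "raw" GSSAPI setup for a keytab shipped with the Java Action can look like. It is an illustration under assumptions, not a tested recipe: the class name `Hs2RawKerberos`, the file names, host, and principals are all hypothetical, and whether a given Hive JDBC driver version picks up credentials this way should be verified on your cluster. The three system properties and the `principal=` URL parameter are standard JDK and Hive JDBC settings.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper for a keytab-based ("raw" GSSAPI) HiveServer2 connection.
final class Hs2RawKerberos {

    // Assumed content of the JAAS file passed as jaasConf (paths and principal are placeholders):
    //
    //   com.sun.security.jgss.krb5.initiate {
    //     com.sun.security.auth.module.Krb5LoginModule required
    //       useKeyTab=true keyTab="user.keytab"
    //       principal="user@EXAMPLE.COM" doNotPrompt=true;
    //   };

    /** Points the JVM at custom Kerberos and JAAS config files shipped with the action. */
    static void configureGssapi(String krb5Conf, String jaasConf) {
        System.setProperty("java.security.krb5.conf", krb5Conf);
        System.setProperty("java.security.auth.login.config", jaasConf);
        // Let GSSAPI obtain credentials from the JAAS entry instead of the current Subject.
        System.setProperty("javax.security.auth.useSubjectCredsOnly", "false");
    }

    /** Builds a HiveServer2 JDBC URL carrying the server's Kerberos service principal. */
    static String jdbcUrl(String host, int port, String database, String servicePrincipal) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database
                + ";principal=" + servicePrincipal;
    }

    /** Opens the connection; requires the Hive JDBC driver on the classpath and a reachable KDC. */
    static Connection connect(String url) throws SQLException {
        return DriverManager.getConnection(url);
    }
}
```

In line with caveat 1, any HDFS access would come first, then `configureGssapi(...)`, then `connect(jdbcUrl(...))`. The trace flag shown above is passed as a JVM option to the same process.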

Final warning: Kerberos is black magic. It will suck your soul away. More prosaically, it will have you lose many man-days to cryptic config issues, and team morale will suffer. We've been there.