We are using latest versions of Hive as well as Impala. Impala is being authenticated with LDAP and authorization is being done via Sentry. Hive access is not authorized via Sentry as yet. We are creating tables from Impala while the /user/hive/warehouse has group level ownership by "hive" group, hence, the folder permissions are impala:hive.
drwxrwx--T - impala hive 0 2015-08-24 21:16 /user/hive/warehouse/test1.db
drwxrwx--T - impala hive 0 2015-08-11 17:12 /user/hive/warehouse/test1.db/events_test_venus
As can be seen, above folders are owned by Impala and group is Hive, and are group-writable. The group “hive” has a user named “hive” as well:
[root@server ~]# groups hive
hive : hive impala data
[root@server ~]# grep hive /etc/group
hive:x:486:impala,hive,flasun,testuser,fastlane
But when I try to query the table created on the folder, it gives access errors:
[root@jupiter fastlane]# sudo -u hive hive
hive> select * from test1.events_test limit 1;
FAILED: SemanticException Unable to determine if hdfs://mycluster/user/hive/warehouse/test1.db/events_test_venus is encrypted: org.apache.hadoop.security.AccessControlException: Permission denied: user=hive, access=EXECUTE, inode="/user/hive/warehouse/test1.db":impala:hive:drwxrwx--T
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:257)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:238)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkTraverse(DefaultAuthorizationProvider.java:180)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:137)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:138)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6599)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6581)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:6506)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getEZForPath(FSNamesystem.java:9141)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getEZForPath(NameNodeRpcServer.java:1582)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.getEZForPath(AuthorizationProviderProxyClientProtocol.java:926)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getEZForPath(ClientNamenodeProtocolServerSideTranslatorPB.java:1343)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)
Any ideas how to counter it?? Basically, we are trying to exploit the fact that by giving the group level read and write permissions, we should be able to make any group user to create and use the tables created by the folder owner, but that does not seem to be possible. Is it because of the fact that Impala alone has the Sentry authorization which uses the user impersonalization while Hive, stand-alone doesn't?
Can someone please guide or confirm?
Thanks