0 votes

I'm unable to Sqoop-export a Hive table that's partitioned by timestamp.

I have a Hive table that's partitioned by timestamp. The HDFS path it creates contains spaces, which I think is causing issues with Sqoop.

fs -ls
2013-01-28 16:31 /user/hive/warehouse/my_table/day=2013-01-28 00%3A00%3A00

The error from sqoop export:

13/01/28 17:18:23 ERROR security.UserGroupInformation: PriviledgedActionException as:brandon (auth:SIMPLE) cause:java.io.FileNotFoundException: File does not exist: /user/hive/warehouse/my_table/day=2012-10-29 00%3A00%3A00
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1239)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1192)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1165)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1147)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:383)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:170)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44064)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)

If you do:

fs -ls /user/hive/warehouse/my_table/day=2013-01-28 00%3A00%3A00
ls: `/user/hive/warehouse/my_table/day=2013-01-28': No such file or directory
ls: `00%3A00%3A00': No such file or directory

It works if you add quotes:

brandon@prod-namenode-new:~$ fs -ls /user/hive/warehouse/my_table/day="2013-01-28 00%3A00%3A00"
Found 114 items
-rw-r--r--   2 brandon supergroup       4845 2013-01-28 16:30 /user/hive/warehouse/my_table/day=2013-01-28%2000%253A00%253A00/000000_0
...

Can you share the entire Sqoop command that you're using? – Jarek Jarcec Cecho

3 Answers

1 vote

You can try specifying the path with a trailing wildcard, as in "/user/hive/warehouse/my_table/day=2013-01-28*", so the glob matches the encoded partition directory instead of the literal path with a space in it.
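A minimal sketch of how that might look in the export command (the JDBC URL, database, and table name here are placeholders, and whether --export-dir expands the glob depends on your Sqoop/Hadoop version):

    sqoop export \
      --connect jdbc:mysql://db-host/mydb \
      --username brandon \
      --table my_table \
      --export-dir "/user/hive/warehouse/my_table/day=2013-01-28*"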

0 votes

So what you can do is:

Select all the data from Hive and write it to a directory in HDFS (using INSERT OVERWRITE DIRECTORY '..path..' SELECT a.column_1, a.column_n FROM table a), and in the Sqoop command specify the directory location using --export-dir ..dir.. (a sketch follows below).

Hope this helps.
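A minimal sketch of both steps, with placeholder paths, column names, and connection details. Note that INSERT OVERWRITE DIRECTORY writes Ctrl-A (\001)-delimited text by default, so the export needs a matching field terminator:

    -- Step 1 (Hive): dump the table to a plain HDFS directory with no spaces or colons in its name
    INSERT OVERWRITE DIRECTORY '/tmp/my_table_export'
    SELECT a.column_1, a.column_n FROM my_table a;

    # Step 2 (shell): export that directory with Sqoop
    sqoop export \
      --connect jdbc:mysql://db-host/mydb \
      --table my_table \
      --export-dir /tmp/my_table_export \
      --input-fields-terminated-by '\001'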

0 votes

Filenames containing a colon (:) are not supported in HDFS paths, as mentioned in the JIRA on this issue. Hive works around this by encoding the colon as hex (%3A), but when Sqoop tries to read that path it converts it back to a colon and therefore cannot find it. I suggest removing the time part from your directory name and trying again; a sketch follows below. Hope this answers your question.
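For example, a hedged sketch of repartitioning on a plain date string so the partition path contains no space or colon (the new table and column names are hypothetical, and dynamic partitioning must be enabled):

    -- hypothetical: rebuild the table partitioned by a date-only string
    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;

    CREATE TABLE my_table_by_day (column_1 STRING, column_n STRING)
    PARTITIONED BY (day STRING);

    -- to_date() strips the time part, so paths become e.g. day=2013-01-28
    INSERT OVERWRITE TABLE my_table_by_day PARTITION (day)
    SELECT column_1, column_n, to_date(day) FROM my_table;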