We have two clusters: one MapR cluster and one of our own. We want to create a new setup on our own hardware using the MapR data.
- I copied all the ORC files from the MapR cluster, keeping the same folder structure
- Created an ORC-formatted table whose location points to the files copied in the first step
- Then executed this command: "MSCK REPAIR TABLE <>"
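The folder structure matters for the repair step: MSCK REPAIR TABLE registers partitions from directories named in the `key=value` style under the table location. Below is a minimal Python sketch of that mapping. The table name `my_orc_table`, the partition columns `year` and `month`, and both helper functions are hypothetical and only for illustration; they are not taken from the actual setup.

```python
from pathlib import PurePosixPath


def partition_spec(relative_dir):
    """Parse a path like 'year=2019/month=01' into [('year', '2019'), ('month', '01')]."""
    spec = []
    for part in PurePosixPath(relative_dir).parts:
        key, sep, value = part.partition("=")
        if not sep or not key or not value:
            raise ValueError(f"not a key=value partition folder: {part!r}")
        spec.append((key, value))
    return spec


def add_partition_statement(table, relative_dir):
    """Build an ALTER TABLE statement for one partition folder."""
    pairs = ", ".join(f"{key}='{value}'" for key, value in partition_spec(relative_dir))
    return f"ALTER TABLE {table} ADD IF NOT EXISTS PARTITION ({pairs})"


# Hypothetical table and folder names, for illustration only.
print(add_partition_statement("my_orc_table", "year=2019/month=01"))
```

A folder that does not follow the `key=value` pattern makes the sketch raise a `ValueError`, which is a quick way to spot directories that would not map to a partition.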
The three steps above passed without error, but when I query the partitions, the job fails with the error below:
java.lang.IllegalArgumentException: Buffer size too small. size = 262144 needed = 4958903
at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.readHeader(InStream.java:193)
at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:238)
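For context on the numbers in the message: 262144 bytes is 256 KiB, and according to the ORC specification each compressed chunk starts with a 3-byte little-endian header whose value is the chunk length times two plus an "original" (uncompressed) flag. The sketch below restates that header rule and the size check in plain Python. It is not Hive's actual code, and the function names are hypothetical. It only illustrates that the reader decoded a chunk length far larger than the buffer size reported in the message; it says nothing about why.

```python
def encode_chunk_header(chunk_length, is_original=False):
    """Pack a chunk length and the 'original' flag into a 3-byte little-endian header."""
    value = (chunk_length << 1) | int(is_original)
    if value >= 1 << 24:
        raise ValueError("chunk length does not fit in a 3-byte header")
    return bytes([value & 0xFF, (value >> 8) & 0xFF, (value >> 16) & 0xFF])


def decode_chunk_header(header):
    """Return (chunk_length, is_original) from a 3-byte header."""
    value = header[0] | (header[1] << 8) | (header[2] << 16)
    return value >> 1, bool(value & 1)


def check_chunk(header, buffer_size):
    """Mimic the size check: reject a chunk that is longer than the buffer."""
    chunk_length, _ = decode_chunk_header(header)
    if chunk_length > buffer_size:
        raise ValueError(
            f"Buffer size too small. size = {buffer_size} needed = {chunk_length}"
        )
    return chunk_length


# Header example from the ORC specification: a chunk compressed to 100,000 bytes.
assert decode_chunk_header(bytes([0x40, 0x0D, 0x03])) == (100000, False)

# The values from the error message above.
try:
    check_chunk(encode_chunk_header(4958903), buffer_size=262144)
except ValueError as exc:
    print(exc)
```

If the bytes at the position where a header is expected are not a real header (for example, because the file content differs from what the writer produced), the decoded length can be arbitrary, so the "needed" value is not necessarily a meaningful size.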
Can someone tell me whether we can create Hive ORC partitioned tables directly from the ORC files?
My storage is Azure Data Lake.