In an Oozie Hive2 action, I am trying to load a Hive table from '.csv' files inside a compressed '.zip' file. To read the files inside the *.zip from an Oozie Hive action workflow, the Hive action provides the 'archive' element. The zip file just needs to be declared in the 'archive' element, as below:
<archive>${ZipfilePath}#unzipFile</archive>
The reference after '#' in the 'archive' element is the name of the temporary folder that holds the unzipped files. The .csv files inside the .zip can then be read by referring to the path 'unzipFile/.csv'.
The issue is that the Hive action is unable to find the path referred to in the 'archive' element. By default, Hive looks for the unzipped folder in the "hdfs://nameservice1/user/hive/" location and fails with the error:
"Error: Error while compiling statement: FAILED: SemanticException Line 1:17 Invalid path ''unzipFile/file.csv'': No files matching path hdfs://nameservice1/user/hive/unzipFile/file.csv (state=42000,code=40000"
However, I was able to successfully test the 'archive' element using a shell action and 'cat' the file as follows:
cat unzipFile/file.csv
The <archive> instruction works like the Hadoop command-line -archives option or the Hive add archives command. It's meant to ship packaged libraries and/or configuration files. Not data files. – Samson Scharfrichter
[...] .gz extension is recognized automatically) – Samson Scharfrichter
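Following on from the comment above, one possible workaround is to let a shell action (where the unpacked folder is visible, as the 'cat' test shows) copy the .csv files to HDFS before the Hive2 action runs. The relative path probably fails in Hive because the statement is compiled by HiveServer2, which resolves it against its own HDFS home directory and not against the working directory where the archive was unpacked. The sketch below is only an illustration under those assumptions: the function name stage_csvs and the HDFS staging directory are hypothetical, and it relies on the standard hdfs dfs -mkdir -p and hdfs dfs -put -f commands being available to the shell action.

```shell
#!/bin/sh
# Hypothetical helper for an Oozie shell action: copy the CSV files that
# <archive> unpacked into the local working directory up to HDFS, where
# HiveServer2 can read them.
# Usage (both arguments are placeholders):
#   stage_csvs unzipFile /some/hdfs/staging/dir
stage_csvs() {
    src_dir=$1      # local folder created by <archive>...#unzipFile
    hdfs_dir=$2     # assumed HDFS staging directory, readable by Hive

    # Fail early if the archive was not unpacked where we expect it.
    [ -d "$src_dir" ] || { echo "missing local dir: $src_dir" >&2; return 1; }

    # Create the target directory; -p makes this a no-op if it already exists.
    hdfs dfs -mkdir -p "$hdfs_dir" || return 1

    # Upload each CSV; -f overwrites a file left behind by an earlier run.
    for f in "$src_dir"/*.csv; do
        [ -e "$f" ] || continue     # the glob matched nothing
        hdfs dfs -put -f "$f" "$hdfs_dir/" || return 1
    done
}
```

The Hive script would then refer to the files by their full HDFS path under the staging directory instead of the relative 'unzipFile/...' path.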