Is there a way to feed the output of a Hive query directly into a Hadoop streaming job without intermediate files? I could use INSERT OVERWRITE DIRECTORY to write a temporary file and then kick off a Hadoop streaming job, but I would like to avoid the temporary file altogether.

1 Answer

Hive has streaming support: see the "Streaming" section of https://cwiki.apache.org/confluence/display/Hive/GettingStarted. A query can pipe its rows through an external script with SELECT TRANSFORM(...) USING '...'.

You could try passing an MR job jar (or similar) as the 'script' that the query results are streamed to.
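
For the simpler case of a plain script, here is a minimal sketch of what the 'script' side might look like. It assumes Hive's default behaviour of handing rows to the script on stdin as tab-separated lines and reading tab-separated lines back from stdout. The file name `upper_first.py`, the table `my_table`, its columns, and the upper-casing step are all made up for illustration; swap in whatever your streaming mapper actually does.

```python
#!/usr/bin/env python
# upper_first.py (hypothetical name): a script for Hive's TRANSFORM clause.
#
# Assumed usage from Hive (table and column names are placeholders):
#   ADD FILE upper_first.py;
#   SELECT TRANSFORM(name, cnt)
#          USING 'python upper_first.py'
#          AS (name_upper, cnt)
#   FROM my_table;
import sys


def transform_line(line):
    """Turn one tab-separated input row into one tab-separated output row."""
    fields = line.rstrip("\n").split("\t")
    # Illustrative transformation only: upper-case the first column.
    fields[0] = fields[0].upper()
    return "\t".join(fields)


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Hive streams rows in on stdin and reads result rows from stdout.
    for line in stdin:
        stdout.write(transform_line(line) + "\n")


if __name__ == "__main__":
    main()
```

Because the script only reads stdin and writes stdout, the same file should also work as a mapper for a regular Hadoop streaming job, which makes it easy to test locally by piping a sample file through it.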