Does the embedded Flume agent need Hadoop to function on a cluster?

Question

I am trying to add an embedded Flume agent to my web service so that it ships my logs to a separate Hadoop cluster, where my Flume agent is running. To use the embedded Flume agent, does Hadoop need to be running on the server that hosts my web service?

bessbd · Accepted Answer · 2016-09-22T09:32:22

TL;DR: I don't think so.

Longer version: I haven't tested this, but the developer guide (https://flume.apache.org/FlumeDeveloperGuide.html#embedded-agent) says:

Note: The embedded agent has a dependency on hadoop-core.jar.

And in the User Guide (https://flume.apache.org/FlumeUserGuide.html#hdfs-sink), you can specify the HDFS path:

HDFS directory path (eg hdfs://namenode/flume/webdata/)
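
To make this concrete, here is a minimal sketch of the property map an embedded agent would be configured with. As far as I can tell from the developer guide, the embedded agent only supports an Avro sink, so the HDFS path above would be set on the receiving agent, not in the web service. The class name `EmbeddedAgentConfig`, the collector host and the port are placeholders I made up; the `EmbeddedAgent` calls in the comment follow the developer guide and need the flume-ng-embedded-agent jar (and its hadoop-core dependency) on the classpath, not a running Hadoop.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: builds the configuration for an embedded agent
// that forwards events over Avro to a remote Flume agent.
class EmbeddedAgentConfig {

    // collectorHost/collectorPort are assumptions: the remote agent's Avro source.
    static Map<String, String> build(String collectorHost, int collectorPort) {
        Map<String, String> props = new HashMap<>();
        props.put("channel.type", "memory");      // buffer events in memory
        props.put("channel.capacity", "200");
        props.put("sinks", "sink1");              // space-separated sink names
        props.put("sink1.type", "avro");          // Avro sink to the remote agent
        props.put("sink1.hostname", collectorHost);
        props.put("sink1.port", String.valueOf(collectorPort));
        return props;
    }

    // Usage with Flume on the classpath (per the developer guide):
    //
    //   EmbeddedAgent agent = new EmbeddedAgent("myagent");
    //   agent.configure(EmbeddedAgentConfig.build("collector.example.com", 4141));
    //   agent.start();
    //   agent.put(EventBuilder.withBody("a log line".getBytes()));
    //   agent.stop();
}
```

Nothing in this map refers to HDFS or a namenode; the web service only needs network access to the remote agent's Avro source.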

On the other hand, are you sure you want the embedded agent? You could instead run Flume where you want the data to end up and send to it through, for example, an HTTP Source (https://flume.apache.org/FlumeUserGuide.html#http-source), or any other source you can send data to.
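
If you go the HTTP Source route, the web service needs no Flume or Hadoop jars at all. A rough sketch, assuming the source uses its default JSON handler (which accepts a JSON array of events, each with `headers` and `body`); the class name `FlumeHttpClient`, the `host` header and the endpoint URL are placeholders of mine:

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical client that posts one log line to a Flume HTTP Source.
class FlumeHttpClient {

    // Escapes characters that would break a JSON string literal.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    // Wraps a single event in the JSON array format described above.
    static String toPayload(String host, String logLine) {
        return "[{\"headers\":{\"host\":\"" + escape(host)
                + "\"},\"body\":\"" + escape(logLine) + "\"}]";
    }

    // POSTs the payload and returns the HTTP status code.
    // endpoint is an assumption, e.g. "http://flume.example.com:5140".
    static int send(String endpoint, String payload) throws Exception {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(endpoint).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json; charset=UTF-8");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(payload.getBytes(StandardCharsets.UTF_8));
        }
        int status = conn.getResponseCode();
        conn.disconnect();
        return status;
    }
}
```

The trade-off is that you lose the embedded agent's local channel, so you would have to handle buffering and retries yourself if the remote agent is unreachable.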
