I have a requirement to run a Flume agent with a spooling directory as the source. Once all the files in the spool directory have been copied to HDFS (the sink), I want the agent to stop, since at that point I know all the files have been pushed to the channel. I also want to repeat these steps for a different spooling directory each time, stopping the agent once every file in the directory is marked as .COMPLETED. Is there any way to stop the Flume agent?
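To make the stopping condition concrete, here is a minimal sketch of the check described above, not an established Flume feature. It assumes the agent was started as a child process by the same script, and the function names, poll interval, and drain delay are all hypothetical. Note that a .COMPLETED suffix only means the file's events reached the channel, so the sketch waits a little before stopping to give the sink time to drain.

```python
import os
import subprocess
import time


def all_files_completed(spool_dir, suffix=".COMPLETED"):
    """Return True when every regular file in spool_dir ends with suffix.

    Sub-directories are skipped. An empty directory also returns True.
    """
    names = [
        name for name in os.listdir(spool_dir)
        if os.path.isfile(os.path.join(spool_dir, name))
    ]
    return all(name.endswith(suffix) for name in names)


def stop_agent_when_done(agent, spool_dir, poll_seconds=5.0, drain_seconds=30.0):
    """Poll spool_dir and terminate the agent once all files are marked.

    agent is a subprocess.Popen handle for the running Flume agent.
    drain_seconds is an assumed grace period for the sink, not a guarantee
    that the channel is empty.
    """
    while not all_files_completed(spool_dir):
        if agent.poll() is not None:
            return  # the agent already exited on its own
        time.sleep(poll_seconds)
    time.sleep(drain_seconds)
    agent.terminate()  # sends SIGTERM on POSIX systems
    agent.wait()


# Hypothetical usage; the config file, agent name and paths are placeholders:
# agent = subprocess.Popen(["flume-ng", "agent", "--conf", "conf",
#                           "--conf-file", "spool.conf", "--name", "a1"])
# stop_agent_when_done(agent, "/data/spool/run1")
```

For a different spooling directory on each run, the same script could be started again with a configuration file that points at the new directory.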
2
votes
This is not a use case for Flume.
- Farooque
OK. Maybe I can go a level down and explain what I am trying to achieve. I have an ETL process. When a user supplies an input directory, I first copy it to HDFS using the put command and then run a MapReduce job on it. I was trying to find out whether there is a more efficient way of pushing data to HDFS than the put command, so I was exploring Flume for that. The problem is that the spooling directory will change each time, since users may want to load data from different directories. Does this fit Flume's use case? If not, is there another component available for doing this?
- Aditya Calangutkar
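For reference, the two-step process described in the comment above (put, then MapReduce) might be scripted roughly as follows. This is only a sketch of that baseline: the jar path, main class, and directory arguments are hypothetical, and it assumes the target HDFS directory does not exist yet, so that put creates it from the local directory.

```python
import subprocess


def build_put_command(local_dir, hdfs_dir):
    """Command that copies a local directory into HDFS."""
    return ["hdfs", "dfs", "-put", local_dir, hdfs_dir]


def build_job_command(jar_path, main_class, input_dir, output_dir):
    """Command that launches a MapReduce job from a jar.

    The jar is assumed to take the input and output paths as its arguments.
    """
    return ["hadoop", "jar", jar_path, main_class, input_dir, output_dir]


def run_etl(local_dir, hdfs_dir, jar_path, main_class, output_dir):
    """Copy the user's directory to HDFS, then run the job on it.

    check=True raises CalledProcessError if either step fails, so the job
    never runs on a partial upload.
    """
    subprocess.run(build_put_command(local_dir, hdfs_dir), check=True)
    subprocess.run(
        build_job_command(jar_path, main_class, hdfs_dir, output_dir),
        check=True,
    )
```

Because the input directory is just a parameter here, a new directory per user request needs no configuration change.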
3 Answers