Is it possible to set up a Flume client-collector structure using an Avro sink and source in the Cloudera-Quickstart-CDH-VM? I know there's no practical use, but I want to understand how Flume works with Avro files and how I can use them later with Pig etc.
I tried several configurations, but none of them worked. It seems to me that I need several agents, but there can only be one in the VM.
What I tried last:
agent.sources = reader avro-collection-source
agent.channels = memoryChannel memoryChannel2
agent.sinks = avro-forward-sink hdfs-sink
# Client
agent.sources.reader.type = exec
agent.sources.reader.command = tail -f /home/flume/avro/source.txt
agent.sources.reader.logStdErr = true
agent.sources.reader.restart = true
agent.sources.reader.channels = memoryChannel
agent.sinks.avro-forward-sink.type = avro
agent.sinks.avro-forward-sink.hostname = 127.0.0.1
agent.sinks.avro-forward-sink.port = 80
agent.sinks.avro-forward-sink.channel = memoryChannel
agent.channels.memoryChannel.type = memory
agent.channels.memoryChannel.capacity = 10000
agent.channels.memoryChannel.transactionCapacity = 100
# Collector
agent.sources.avro-collection-source.type = avro
agent.sources.avro-collection-source.bind = 127.0.0.1
agent.sources.avro-collection-source.port = 80
agent.sources.avro-collection-source.channels = memoryChannel2
agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.hdfs.path = /var/flume/avro
agent.sinks.hdfs-sink.channel = memoryChannel2
agent.channels.memoryChannel2.type = memory
agent.channels.memoryChannel2.capacity = 20000
agent.channels.memoryChannel2.transactionCapacity = 2000
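For reference, the multi-agent layout I have in mind would look roughly like the sketch below, with the client and the collector as two separately named agents in one file. This is untested; the agent names client and collector, the file name two-agents.conf, and port 41414 are placeholders I made up, and the start commands in the comments are only how I assume each agent would be launched by name.

```properties
# Untested sketch: two named agents defined in one file.
# "client", "collector", "two-agents.conf" and port 41414 are placeholders.
# Assumed start commands, one process per agent:
#   flume-ng agent --conf conf --conf-file two-agents.conf --name collector
#   flume-ng agent --conf conf --conf-file two-agents.conf --name client

# Client agent: tails the file and forwards events through an Avro sink
client.sources = reader
client.channels = memoryChannel
client.sinks = avro-forward-sink
client.sources.reader.type = exec
client.sources.reader.command = tail -f /home/flume/avro/source.txt
client.sources.reader.channels = memoryChannel
client.sinks.avro-forward-sink.type = avro
client.sinks.avro-forward-sink.hostname = 127.0.0.1
client.sinks.avro-forward-sink.port = 41414
client.sinks.avro-forward-sink.channel = memoryChannel
client.channels.memoryChannel.type = memory
client.channels.memoryChannel.capacity = 10000
client.channels.memoryChannel.transactionCapacity = 100

# Collector agent: receives events on an Avro source and writes them to HDFS
collector.sources = avro-collection-source
collector.channels = memoryChannel2
collector.sinks = hdfs-sink
collector.sources.avro-collection-source.type = avro
collector.sources.avro-collection-source.bind = 127.0.0.1
collector.sources.avro-collection-source.port = 41414
collector.sources.avro-collection-source.channels = memoryChannel2
collector.sinks.hdfs-sink.type = hdfs
collector.sinks.hdfs-sink.hdfs.path = /var/flume/avro
collector.sinks.hdfs-sink.channel = memoryChannel2
collector.channels.memoryChannel2.type = memory
collector.channels.memoryChannel2.capacity = 20000
collector.channels.memoryChannel2.transactionCapacity = 2000
```

I picked a port above 1024 in the sketch only because binding to port 80 normally needs root privileges; I do not know whether that is related to my problem.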
Thanks for any advice!