Below is my configuration in flume.conf:

agent.sources = srcpv
agent.channels = chlpv
agent.sinks = hdfsSink
agent.sources.srcpv.type = exec
agent.sources.srcpv.command = tail -F /var/log/collector/web/pv.log
agent.sources.srcpv.channels = chlpv
agent.channels.chlpv.type = memory
agent.channels.chlpv.capacity = 1000000
agent.channels.chlpv.transactionCapacity = 100
agent.sinks.hdfsSink.type = hdfs
agent.sinks.hdfsSink.channel = chlpv
agent.sinks.hdfsSink.hdfs.path = hdfs://hadoop01:8020/flume/web/pv/
agent.sinks.hdfsSink.hdfs.filePrefix = pv-
agent.sinks.hdfsSink.hdfs.rollSize = 1024
agent.sinks.hdfsSink.hdfs.rollInterval = 30
agent.sinks.hdfsSink.hdfs.rollCount = 10

I'd like the file to roll at a certain size or interval, but the roll settings (rollSize, rollInterval, rollCount) do not take effect and no HDFS file is generated. After several minutes I get this error:

[SinkRunner-PollingRunner-DefaultSinkProcessor] ERROR org.apache.flume.sink.hdfs.HDFSEventSink - process failed java.lang.OutOfMemoryError: GC overhead limit exceeded

Can anyone help point out the appropriate HDFS sink settings?


1 Answer


It seems you are running out of Java heap memory when running Flume.

You can try adding the line below to the flume-env.sh file:

export JAVA_OPTS="-Xms100m -Xmx2g -Dcom.sun.management.jmxremote"

Increase the -Xmx value according to the memory available on your system.
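
For reference, a minimal sketch of how the setting gets picked up, assuming your agent configuration lives in a conf/ directory as shown below (the flume-ng launcher sources flume-env.sh from the directory passed to --conf):

# conf/flume-env.sh -- sourced by the flume-ng launcher at startup
export JAVA_OPTS="-Xms100m -Xmx2g -Dcom.sun.management.jmxremote"

# Start the agent; --conf must point at the directory containing flume-env.sh,
# and --name must match the agent name used in your properties file ("agent" here)
flume-ng agent --conf conf --conf-file conf/flume.conf --name agent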

Hope this helps :)