1
votes

I’ve checked the Cygnus logs in the morning and see these channel errors. It seems that ckan-channel is full and can’t continue the transactions.

What do you suggest us for this error?

16/06/01 08:11:58 WARN http.HTTPSource: Error appending event to channel. Channel might be full. Consider increasing the channel capacity or make sure the sinks perform faster.
org.apache.flume.ChannelException: Unable to put batch on required channel: org.apache.flume.channel.MemoryChannel{name: ckan-channel}
        at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:200)
        at org.apache.flume.source.http.HTTPSource$FlumeHTTPServlet.doPost(HTTPSource.java:201)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:725)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:814)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:401)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: org.apache.flume.ChannelException: Space for commit to queue couldn't be acquired Sinks are likely not keeping up with sources, or the buffer size is too tight
        at org.apache.flume.channel.MemoryChannel$MemoryTransaction.doCommit(MemoryChannel.java:128)
        at org.apache.flume.channel.BasicTransactionSemantics.commit(BasicTransactionSemantics.java:151)
        at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:192)
        ... 16 mor16/06/01 08:11:58 WARN http.HTTPSource: Error appending event to channel. Channel might be full. Consider increasing the channel capacity or make sure the sinks perform faster.
org.apache.flume.ChannelException: Unable to put batch on required channel: org.apache.flume.channel.MemoryChannel{name: ckan-channel}
        at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:200)
        at org.apache.flume.source.http.HTTPSource$FlumeHTTPServlet.doPost(HTTPSource.java:201)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:725)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:814)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:401)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:945)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:228)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: org.apache.flume.ChannelException: Space for commit to queue couldn't be acquired Sinks are likely not keeping up with sources, or the buffer size is too tight
        at org.apache.flume.channel.MemoryChannel$MemoryTransaction.doCommit(MemoryChannel.java:128)
        at org.apache.flume.channel.BasicTransactionSemantics.commit(BasicTransactionSemantics.java:151)
        at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:192)
        ... 16 more

Also I see that hfds-channel/sink is not even active in the logs.

Is this because ckan-channel is giving errors so that hdfs-channel is not called for this?

1

1 Answers

1
votes

What you are experiencing is a well known log error from Flume. The log is descriptive itself: one of the channels internally communicating a source and a sink is getting full.

The reason? The source is generating events faster than the sink is able to deal with. There are a couple of things that can be done in this case:

  • Reduce the amount of notifications sent to Cygnus. This seems obvious, but you may say “that’s my data and I want to persist them all”. Well, we are talking about reducing the number of notifications, not the amount of data. For instance, instead of sending a notification for each attribute has changed about a certain entity, send all the attributes in the same notification. This can be achieved by changing the way you subscribe Cygnus to Orion. The reason behind reducing the number of notifications is that the channel capacity is measured in terms of Flume events (usually, there is a Flume event per notification), not in bytes. So, having bigger notifications will allow you to persist the same data, but consuming less capacity of the channel.
  • Increasing the capacity of the channel. This is only useful if the notification throughput is not regular. If regular, this will only delay the problem.
  • Using batching. This is widely explained in all the sinks supporting this feature (btw, most of them). For instance, have a look on the explanation on HDFS sink documentation, but the conclusions are the same for all the sinks.
  • Using parallelization. Instead of having a single sink, try adding more sinks running in parallel. This is achieved by connecting all these sinks to the same channel, but IMHO the best is having a channel for each sink and using our custom RoundRobinChannelSelector. This is a very advance feature, so ask me if the previous ones don’t work for you.

You may have a look on the Performance section.