
TL;DR - I'm having connection issues between the DataStax Java Cassandra driver and a DataStax Cassandra cluster. The driver initially connects and performs well, then at some point it loses the connection and does not reconnect. From then on, all queries fail.

More info -

I'm running a 3-node DataStax Cassandra 2.1 cluster on CentOS and using DataStax Cassandra driver 3.0.0. Everything worked great for the past few months. Recently I deployed some code changes that included schema changes (namely, adding columns to an existing table) and an increase in the number of queries made. The disconnections started at this point.

When my app starts, it connects to the cluster and holds a single Cluster (and Session) object, as shown in the code snippet below. At this point everything works well. After a few hours I start receiving NoHostAvailableException for every query performed. Other servers are still working fine against the same Cassandra cluster at that time, so I know there's nothing wrong with the cluster itself. When I restart my server, everything works again.

After investigating a little more, I see that once the issue starts occurring there is no active connection to any node. I set up the driver to log at DEBUG level into a dedicated log file and waited for the issue to recur. A few hours later it occurred again, and at some point the log file shows this message:

Connection[/10.4.116.91:9042-1, inFlight=2, closed=false] connection error
io.netty.handler.codec.DecoderException: com.datastax.driver.core.exceptions.DriverInternalError: Adjusted frame length exceeds 268435456: 326843398 - discarded
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:418)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:245)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:292)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:278)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:962)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:485)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:399)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:371)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112)
        at java.lang.Thread.run(Thread.java:745)
Caused by: com.datastax.driver.core.exceptions.DriverInternalError: Adjusted frame length exceeds 268435456: 326843398 - discarded
        at com.datastax.driver.core.Frame$Decoder$DecoderForStreamIdSize.decode(Frame.java:239)
        at com.datastax.driver.core.Frame$Decoder.decode(Frame.java:205)
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:387)
        ... 11 common frames omitted

Right after that, the same error is logged again:

Connection[/10.4.116.91:9042-1, inFlight=2, closed=false] connection error
io.netty.handler.codec.DecoderException: com.datastax.driver.core.exceptions.DriverInternalError: Adjusted frame length exceeds 268435456: 326843398 - discarded
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:418)
        at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:245)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:292)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:278)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:962)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:485)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:399)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:371)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112)
        at java.lang.Thread.run(Thread.java:745)
Caused by: com.datastax.driver.core.exceptions.DriverInternalError: Adjusted frame length exceeds 268435456: 326843398 - discarded
        at com.datastax.driver.core.Frame$Decoder$DecoderForStreamIdSize.decode(Frame.java:239)
        at com.datastax.driver.core.Frame$Decoder.decode(Frame.java:205)
        at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:387)
        ... 11 common frames omitted

From this point on, the log shows only timeouts and retries, but the connection never gets re-established.

// CREATION OF CASSANDRA SESSION
PoolingOptions poolingOptions = new PoolingOptions();
poolingOptions
    .setPoolTimeoutMillis(0)
    .setMaxRequestsPerConnection(HostDistance.LOCAL, 32768)
    .setMaxRequestsPerConnection(HostDistance.REMOTE, 2000);
cluster = builder.withPoolingOptions(poolingOptions).build();
cluster.getConfiguration().getCodecRegistry().register(new EnumNameCodec<>(OnBoardingSlide.Type.class));
session = cluster.connect(Global.getServerConfig().CASSANDRA_KEYSPACE_NAME);
Are you doing insertions or just reading from Cassandra? - root
Both. All queries fail. - Aviv Carmi

1 Answer


This might be a bug in the Java driver.

If a Cassandra node is configured with native_transport_max_frame_size_in_mb > 256 and the driver reads a frame larger than 256 MB, it throws an exception. This breaks the driver's ability to read subsequent packets, since the Decoder for parsing frames is static.
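To relate this to the numbers in the question's log: 268435456 is exactly 256 MB expressed in bytes, and the reported frame length of 326843398 bytes is above it. The following is only a small arithmetic sketch of that comparison, not the driver's actual decoder code; the class and method names (`FrameLimitCheck`, `exceedsLimit`) are made up for illustration.

```java
public class FrameLimitCheck {
    // 256 MB in bytes; this equals the limit printed in the log message (268435456).
    static final long MAX_FRAME_LENGTH_BYTES = 256L * 1024 * 1024;

    // Hypothetical helper: true when a frame of this length is larger than the 256 MB cap.
    static boolean exceedsLimit(long frameLengthBytes) {
        return frameLengthBytes > MAX_FRAME_LENGTH_BYTES;
    }

    public static void main(String[] args) {
        long reported = 326843398L; // frame length taken from the log in the question
        System.out.println("limit in bytes: " + MAX_FRAME_LENGTH_BYTES);
        System.out.println("exceeds limit:  " + exceedsLimit(reported));
        System.out.println("over by bytes:  " + (reported - MAX_FRAME_LENGTH_BYTES));
    }
}
```

So the frame in the log is roughly 58 MB over the cap, which is consistent with the "Adjusted frame length exceeds" error being raised.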

This has been fixed in 3.0.4. Here is the link with the details:

https://datastax-oss.atlassian.net/browse/JAVA-1292

Can you try upgrading your driver?