I am running a multi-DC Cassandra (open-source, not DSE) cluster in AWS, where one DC (us-west-2) is set up for analytics and the other (us-east) is the transactional store. I'm using NetworkTopologyStrategy with the EC2 snitch, and a consistency level of LOCAL_ONE in my Hadoop config. Hadoop can read from Cassandra without issue, but attempting to write produces a timeout exception.

Running nodetool status shows the DCs are properly configured:

Datacenter: us-west-2
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Owns   Host ID                               Token                                    Rack
UN  x.x.x.x       1.01 GB     9.9%   9e7f4393-7ac9-4559-b3ff-de48be50016f  -9127921345534057723                     2a
UN  x.x.x.x       1001.16 MB  11.4%  d0760383-c3dd-474c-9261-239b71dba3f1  -9221279003374097975                     2b
UN  x.x.x.x       1.05 GB     11.7%  3f09fbf5-0d85-4283-9009-0ec0e29223c0  -9140104347498952504                     2c
Datacenter: us-east
===================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load       Owns   Host ID                               Token                                    Rack
UN  x.x.x.x       1.1 GB     11.3%  5bbd2de4-e1d2-4a17-9f40-034f60b35954  -9061054426204373981                     1b
UN  x.x.x.x       1.15 GB    11.5%  e34c590e-6176-45b2-a8f9-18b4a9a80032  -9216519687724118609                     1c
UN  x.x.x.x       1.18 GB    10.9%  fa0b0a1a-f156-40fc-a267-970d1eb9cddb  -9207673937991303291                     1a
UN  x.x.x.x       1.46 GB    10.7%  b18ae406-c9ec-42b7-a365-b0c6e2fe582f  -9206671929961171506                     1a
UN  x.x.x.x       1.13 GB    11.4%  1ac9c1c5-55ad-4048-b1ba-3b9768933ecc  -9146100851344467112                     1c
UN  x.x.x.x       1.53 GB    11.2%  dad665bb-68d9-4811-b421-f33333261867  -9178920986366339267                     1b

Stack trace using ColumnFamilyOutputFormat:

java.io.IOException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection timed out
    at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter$RangeClient.run(ColumnFamilyRecordWriter.java:224)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection timed out
    at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
    at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
    at org.apache.cassandra.thrift.TFramedTransportFactory.openTransport(TFramedTransportFactory.java:41)
    at org.apache.cassandra.hadoop.AbstractColumnFamilyOutputFormat.createAuthenticatedClient(AbstractColumnFamilyOutputFormat.java:123)
    at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter$RangeClient.run(ColumnFamilyRecordWriter.java:215)
Caused by: java.net.ConnectException: Connection timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
    ... 4 more

... and using CqlOutputFormat:

java.io.IOException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection timed out
    at org.apache.cassandra.hadoop.cql3.CqlRecordWriter$RangeClient.run(CqlRecordWriter.java:271)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection timed out
    at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
    at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
    at org.apache.cassandra.thrift.TFramedTransportFactory.openTransport(TFramedTransportFactory.java:41)
    at org.apache.cassandra.hadoop.AbstractColumnFamilyOutputFormat.createAuthenticatedClient(AbstractColumnFamilyOutputFormat.java:123)
    at org.apache.cassandra.hadoop.cql3.CqlRecordWriter$RangeClient.run(CqlRecordWriter.java:262)
Caused by: java.net.ConnectException: Connection timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
    ... 4 more

Both traces ultimately point to AbstractColumnFamilyOutputFormat.createAuthenticatedClient(host, port, conf).

I then opened that source and added some detail to the exception so it would output the host name it's connecting to, which resulted in this trace:

java.io.IOException: java.lang.Exception: Unable to connect to host [hostname]
    at org.apache.cassandra.hadoop.cql3.CqlRecordWriter$RangeClient.run(CqlRecordWriter.java:271)
Caused by: java.lang.Exception: Unable to connect to host [hostname]
    at org.apache.cassandra.hadoop.AbstractColumnFamilyOutputFormat.createAuthenticatedClient(AbstractColumnFamilyOutputFormat.java:139)
    at org.apache.cassandra.hadoop.cql3.CqlRecordWriter$RangeClient.run(CqlRecordWriter.java:262)
Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection timed out
    at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
    at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
    at org.apache.cassandra.thrift.TFramedTransportFactory.openTransport(TFramedTransportFactory.java:41)
    at org.apache.cassandra.hadoop.AbstractColumnFamilyOutputFormat.createAuthenticatedClient(AbstractColumnFamilyOutputFormat.java:124)
    ... 1 more
Caused by: java.net.ConnectException: Connection timed out
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:579)
    at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
    ... 4 more
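
For readers who want to reproduce this kind of diagnostic, here is a minimal sketch of wrapping a connection failure so that the message names the target host. It is not the actual modification made to AbstractColumnFamilyOutputFormat: the class and method names (ConnectWithHostInfo, describeFailure, open) are hypothetical, and a plain java.net.Socket stands in for the Thrift TSocket so the example stays self-contained.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not Cassandra code: shows how a connect failure
// can be rethrown with the target host in the message.
final class ConnectWithHostInfo {

    // Wraps a low-level failure so the message names the host, keeping the cause.
    static Exception describeFailure(String host, Exception cause) {
        return new Exception("Unable to connect to host " + host, cause);
    }

    // Opens a TCP connection; on failure, rethrows with the host in the message.
    static Socket open(String host, int port, int timeoutMs) throws Exception {
        Socket socket = new Socket();
        try {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return socket;
        } catch (IOException e) {
            try {
                socket.close();
            } catch (IOException ignored) {
                // Nothing useful to do if closing the failed socket also fails.
            }
            throw describeFailure(host, e);
        }
    }
}
```

Keeping the original exception as the cause preserves the underlying "Connection timed out" frames, which is why the trace above still shows them beneath the new message.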

The problem is that [hostname] is a machine that's not in the analytics DC (it's in us-east). Why doesn't the client know this automagically, especially when reads work properly? It seems to be trying every node in the ring regardless of DC.

For the record, writes fail using CqlOutputFormat, ColumnFamilyOutputFormat, and through Pig using CqlStorage and CassandraStorage.

2 Answers

0
votes

Try setting write_request_timeout_in_ms in cassandra.yaml to a very high value and see if that helps. The node itself may be the problem: it can stop responding while still appearing as up. If the write still times out, restart the Cassandra service on the node you suspect is causing the issue.

0
votes

This issue came down to two things:

  1. For multi-region EC2 setups, Cassandra requires setting broadcast_address to the public IP and listen_address to the internal IP. In most cases you'll want rpc_address to be the internal IP as well, but this can break Cassandra's Hadoop client, which determines the endpoints to talk to based on broadcast_address.

  2. Cassandra's Hadoop client (RingCache specifically) doesn't respect the data center during node discovery and tries to discover all nodes in the ring, including non-local ones. It does respect the consistency level on the actual write, but in our case it never got that far because of #1.
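
To illustrate what DC-aware discovery in point 2 could look like, here is a minimal sketch that keeps only the endpoints belonging to the local data center. This is not RingCache code and not taken from Cassandra: the class LocalDcEndpoints, the method filterByDataCenter, the host-to-DC map, and the sample addresses are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Cassandra: illustrates DC-aware endpoint selection.
final class LocalDcEndpoints {

    // Returns only the hosts whose data center equals localDc, preserving input order.
    static List<String> filterByDataCenter(Map<String, String> hostToDc, String localDc) {
        List<String> local = new ArrayList<>();
        for (Map.Entry<String, String> entry : hostToDc.entrySet()) {
            if (localDc.equals(entry.getValue())) {
                local.add(entry.getKey());
            }
        }
        return local;
    }

    public static void main(String[] args) {
        // Made-up addresses; the DC names match the nodetool output in the question.
        Map<String, String> ring = new LinkedHashMap<>();
        ring.put("10.0.1.10", "us-west-2");
        ring.put("10.0.1.11", "us-west-2");
        ring.put("10.1.2.20", "us-east");
        // Only the two us-west-2 hosts remain as connection candidates.
        System.out.println(filterByDataCenter(ring, "us-west-2"));
    }
}
```

With a filter like this applied before any connections are opened, a Hadoop job running in the analytics DC would never attempt to reach the us-east nodes that caused the timeouts.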

I filed a ticket and submitted a patch to address these issues:

https://issues.apache.org/jira/browse/CASSANDRA-7252