
I have an issue with Apache Storm and Kafka. The KafkaSpout reads messages from Kafka normally, but after around 30,000 messages, failed tuples start to appear and the bolt no longer receives any messages.

I checked worker.log. When the topology started, it read the partition info from ZooKeeper and then from the broker, and this succeeded, as you can see (offset 9539):

Read partition information from: /twitter_streaming_tweet_test/STREAMING_TWEET_WRITER_SPOUT/partition_2  --> {"partition":2,"offset":9539,"topology":{"name":"DATA_WRITER_TOPOLOGY","id":"DATA_WRITER_TOPOLOGY-67-1516077955"},"topic":"twitter_streaming_tweet_test","broker":{"port":9092,"host":"zoo1"}}

2018-01-16 17:05:57.510 o.a.s.k.PartitionManager Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Read last commit offset from zookeeper: 9539; old topology_id: DATA_WRITER_TOPOLOGY-67-1516077955 - new topology_id: DATA_WRITER_TOPOLOGY-68-1516089922
2018-01-16 17:05:57.514 o.a.s.k.PartitionManager Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Starting Kafka zoo1 Partition{host=zoo1:9092, topic=twitter_streaming_tweet_test, partition=2} from offset 9539
2018-01-16 17:05:57.518 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Finished refreshing

Then the topology runs normally until about 30,000 messages have been processed:

2018-01-16 17:06:39.732 TWLogger Thread-7-STREAMING_TWEET_WRITER_BOLT-executor[3 3] [INFO] Tweet ID 952850493570654209 was saved to database
2018-01-16 17:06:39.739 TWLogger Thread-9-STREAMING_TWEET_WRITER_BOLT-executor[6 6] [INFO] Tweet ID 952850099335348224 was saved to database
2018-01-16 17:06:39.742 TWLogger Thread-7-STREAMING_TWEET_WRITER_BOLT-executor[3 3] [INFO] Tweet ID 952850787981393920 was saved to database
2018-01-16 17:06:39.753 TWLogger Thread-7-STREAMING_TWEET_WRITER_BOLT-executor[3 3] [INFO] Tweet ID 952850152573685760 was saved to database
2018-01-16 17:06:39.754 TWLogger Thread-9-STREAMING_TWEET_WRITER_BOLT-executor[6 6] [INFO] Tweet ID 952850099578654721 was saved to database
2018-01-16 17:06:39.763 TWLogger Thread-7-STREAMING_TWEET_WRITER_BOLT-executor[3 3] [INFO] Tweet ID 952850153173524481 was saved to database
2018-01-16 17:06:39.768 TWLogger Thread-9-STREAMING_TWEET_WRITER_BOLT-executor[6 6] [INFO] Tweet ID 952850099989704705 was saved to database
2018-01-16 17:06:39.776 TWLogger Thread-7-STREAMING_TWEET_WRITER_BOLT-executor[3 3] [INFO] Tweet ID 952850153232154624 was saved to database
2018-01-16 17:06:39.779 TWLogger Thread-9-STREAMING_TWEET_WRITER_BOLT-executor[6 6] [INFO] Tweet ID 952850758289956864 was saved to database
2018-01-16 17:06:39.787 TWLogger Thread-7-STREAMING_TWEET_WRITER_BOLT-executor[3 3] [INFO] Tweet ID 952850154436018176 was saved to database
2018-01-16 17:07:56.106 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Refreshing partition manager connections
2018-01-16 17:07:56.117 o.a.s.k.DynamicBrokersReader Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Read partition info from zookeeper: GlobalPartitionInformation{topic=twitter_streaming_tweet_test, partitionMap={0=zoo2:9092, 1=zoo3:9092, 2=zoo1:9092}}
2018-01-16 17:07:56.117 o.a.s.k.KafkaUtils Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] assigned [Partition{host=zoo1:9092, topic=twitter_streaming_tweet_test, partition=2}]
2018-01-16 17:07:56.117 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Deleted partition managers: []
2018-01-16 17:07:56.117 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] New partition managers: []
2018-01-16 17:07:56.117 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Finished refreshing
2018-01-16 17:09:54.150 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Refreshing partition manager connections
2018-01-16 17:09:54.160 o.a.s.k.DynamicBrokersReader Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Read partition info from zookeeper: GlobalPartitionInformation{topic=twitter_streaming_tweet_test, partitionMap={0=zoo2:9092, 1=zoo3:9092, 2=zoo1:9092}}
2018-01-16 17:09:54.160 o.a.s.k.KafkaUtils Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] assigned [Partition{host=zoo1:9092, topic=twitter_streaming_tweet_test, partition=2}]
2018-01-16 17:09:54.160 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Deleted partition managers: []
2018-01-16 17:09:54.160 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] New partition managers: []
2018-01-16 17:09:54.160 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Finished refreshing
2018-01-16 17:10:56.108 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Refreshing partition manager connections

Tweets are saved normally at first. Then the KafkaSpout refreshes the partition info from ZooKeeper and seems to find nothing (the deleted and new partition manager lists are both empty). After that no tuples are processed and the topology is stuck. Can anyone help me solve this problem? Thank you very much.

1 Answer

Can you check your max.spout.pending value (the topology.max.spout.pending setting)? If it is set very high, failed tuples will eventually show up in the Storm stats: the spout keeps more tuples in flight than the bolts can process before the message timeout, so the excess tuples time out and are reported as failed. If you can post the Storm stats for the spout and bolts together with the max.spout.pending value, that will help to understand the issue.
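
To make that reasoning concrete, here is a rough back-of-the-envelope sketch in plain Java (no Storm dependency). The numbers are assumptions, not measurements: about 10 ms per tuple is read off the gaps between the "saved to database" lines in the question, 2 bolt executors and 3 spout tasks are what the pasted log shows, and 30 seconds is Storm's default for topology.message.timeout.secs. The class and method names are hypothetical helpers made up for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Storm.
final class SpoutPendingCheck {

    // Rough upper bound on how many tuples the bolts can finish within one
    // message timeout, assuming a constant per-tuple processing time.
    static long maxTuplesWithinTimeout(int boltExecutors, double perTupleMillis, int timeoutSecs) {
        return (long) (boltExecutors * (timeoutSecs * 1000.0 / perTupleMillis));
    }

    // topology.max.spout.pending applies per spout task, so the number of
    // tuples in flight can reach maxSpoutPending * spoutTasks.
    static boolean likelyToTimeOut(int maxSpoutPending, int spoutTasks, long capacity) {
        return (long) maxSpoutPending * spoutTasks > capacity;
    }

    // The two settings as plain config keys. With Storm on the classpath the
    // equivalent would be Config.setMaxSpoutPending(...) and
    // Config.setMessageTimeoutSecs(...).
    static Map<String, Object> topologyConf(int maxSpoutPending, int timeoutSecs) {
        Map<String, Object> conf = new HashMap<>();
        conf.put("topology.max.spout.pending", maxSpoutPending);
        conf.put("topology.message.timeout.secs", timeoutSecs);
        return conf;
    }

    public static void main(String[] args) {
        // Assumed values: 2 bolt executors, ~10 ms per tuple, 30 s timeout.
        long capacity = maxTuplesWithinTimeout(2, 10.0, 30);
        System.out.println("Tuples the bolts can finish per timeout window: " + capacity);
        System.out.println("5000 pending x 3 spout tasks at risk: " + likelyToTimeOut(5000, 3, capacity));
        System.out.println("1000 pending x 3 spout tasks at risk: " + likelyToTimeOut(1000, 3, capacity));
        System.out.println("Candidate config: " + topologyConf(1000, 30));
    }
}
```

If the estimate says the in-flight tuples exceed what the bolts can finish within the timeout, lowering max.spout.pending (or raising the timeout or the bolt parallelism) is the usual thing to try. The real per-tuple latency should come from the Storm UI stats, not from this guess.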