2
votes

My topology looks like this:

[Image: Data_Enrichment_Persistence_Topology]

Basically, the problem I am trying to solve is this: whenever an issue occurs in the Stop or Load service bolts and a tuple fails, the tuple is replayed and the spout re-emits it. This makes the Cassandra bolt reprocess the tuple and rewrite the data.

I cannot make the tuples in the Load and Stop bolts unanchored, as I need them to be replayed in case of any failure. However, I only want the upper workflow to be replayed.

I am using a KafkaSpout to emit data (it emits on the "default" stream). I am not sure how to duplicate the streams at the KafkaSpout's emit level.

If I can duplicate the streams, a replay on either of the two will only re-emit the message on that particular stream at the spout level, leaving the other stream untouched, right?

TIA!


1 Answer

3
votes

You need to use two output streams in your spout, one for each downstream path. Furthermore, you emit each tuple to both streams, using a different message ID for each.

Thus, if one fails, you can replay the tuple to just that stream.
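To make the bookkeeping concrete, here is a minimal sketch in plain Java (no Storm dependency) of how a custom spout could track one message ID per stream. Everything in it is an assumption for illustration: the class name `PerStreamReplayTracker`, the stream names `persist` and `enrich`, and the `stream:sourceId` message ID format are hypothetical, and the stock KafkaSpout does not do this out of the box.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: tracks every (stream, tuple) pair under its own message ID. */
class PerStreamReplayTracker {
    // Hypothetical stream names, one per downstream path.
    static final String PERSIST_STREAM = "persist";
    static final String ENRICH_STREAM = "enrich";

    /** One emit instruction: target stream, payload, and the message ID to anchor it with. */
    static final class Emission {
        final String streamId;
        final String payload;
        final String messageId;

        Emission(String streamId, String payload, String messageId) {
            this.streamId = streamId;
            this.payload = payload;
            this.messageId = messageId;
        }
    }

    // Emissions that have been sent but not yet acked, keyed by message ID.
    private final Map<String, Emission> pending = new HashMap<>();

    /** Builds one emission per stream, each with a distinct message ID. */
    List<Emission> emitToBoth(String sourceId, String payload) {
        List<Emission> out = new ArrayList<>();
        for (String stream : new String[] {PERSIST_STREAM, ENRICH_STREAM}) {
            Emission e = new Emission(stream, payload, stream + ":" + sourceId);
            pending.put(e.messageId, e);
            out.add(e);
        }
        return out;
    }

    /** Call from the spout's ack(Object msgId): that stream's copy is done. */
    void ack(Object messageId) {
        pending.remove(messageId);
    }

    /** Call from the spout's fail(Object msgId): returns only the failed stream's emission, or null if unknown. */
    Emission replay(Object messageId) {
        return pending.get(messageId);
    }

    int pendingCount() {
        return pending.size();
    }
}
```

In a real spout, `declareOutputFields` would call `declarer.declareStream(streamId, fields)` once per stream, `nextTuple` would call `collector.emit(e.streamId, new Values(e.payload), e.messageId)` for each returned `Emission`, and `fail` would re-emit only the `Emission` returned by `replay`. Each bolt then subscribes to its own stream, for example with `shuffleGrouping("spout", "persist")`, so a failure on one path never re-triggers the other.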