Acking every message in Storm is inefficient, and only some of the components in my topology need guaranteed message processing. Is there a proper way to enable acking for just those components?

For instance, I have a TimingBolt that uses tick tuples to run its job on a fixed cycle:

// TimingBolt
@Override
public void execute(Tuple input) {
    if (TupleUtils.isTick(input)) {
        collector.emit(streamA, input, new Values("Tick"));
    } else {
        collector.emit(streamB, new Values("Message"));
    }
}

I want to guarantee that the "Tick" message is delivered exactly once to the bolt that follows TimingBolt:

// The AggregateBolt after TimingBolt
@Override
public void execute(Tuple input) {
    if (input.getString(0).equals("Tick")) {
        collector.emit(new Values("Get Tick"));
        collector.ack(input); // ack takes the tuple being acknowledged
    } else {
        // do something else
        collector.emit(new Values("Not Tick"));
    }
}

I would also like every bolt other than TimingBolt and AggregateBolt to stay outside the ack tree.

The documentation at http://storm.apache.org/documentation/Guaranteeing-message-processing.html does not cover this case. Is this a valid scenario, or is starting the ack from the spout the only way to make the acker work?

1 Answer

You have to start from the spout.

To be clear, what's called "a reliable topology" does not guarantee message delivery. Instead, it guarantees that either a tuple and all of its "descendant tuples" are completely delivered and processed, or the spout is notified of the failure. Failed messages can be automatically re-emitted, but ultimately there is a small window in which a tuple is no longer retried. For that to work, the spout has some reliable tuple behavior that bolts do not: (1) the ability to emit an object id with a tuple and (2) methods that are called with that id when the tuple ultimately succeeds or fails (ack(id) and fail(id), respectively). Because bolts don't have those behaviors, you can't start reliable tuple processing from a bolt.
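To make those two spout-only behaviors concrete, here is a minimal plain-Java model of the bookkeeping a reliable spout typically does. It is a sketch, not Storm code: the class name PendingTracker and its methods are hypothetical, and it uses no Storm classes. In a real spout this logic would presumably sit behind nextTuple(), ack(Object) and fail(Object), with the id passed as the message id when emitting.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical stand-in for a reliable spout's bookkeeping (no Storm classes).
class PendingTracker {
    // Payloads that were emitted with an id and are not yet acked or failed.
    private final Map<Long, String> pending = new HashMap<>();
    // Payloads whose tuple tree failed and that should be emitted again.
    private final Queue<String> replayQueue = new ArrayDeque<>();
    private long nextId = 0;

    // Models behavior (1): emitting a tuple together with an object id.
    long emit(String payload) {
        long id = nextId++;
        pending.put(id, payload);
        return id;
    }

    // Models ack(id): the whole tuple tree succeeded, so forget the payload.
    void ack(long id) {
        pending.remove(id);
    }

    // Models fail(id): the tree failed or timed out, so queue the payload for replay.
    void fail(long id) {
        String payload = pending.remove(id);
        if (payload != null) {
            replayQueue.add(payload);
        }
    }

    int pendingCount() {
        return pending.size();
    }

    // Returns the next payload to re-emit, or null if nothing is waiting.
    String pollReplay() {
        return replayQueue.poll();
    }
}
```

A bolt has no equivalent of the pending map or the ack/fail callbacks, which is why the tree has to be rooted at the spout.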

Consider directly configuring your second bolt with TOPOLOGY_TICK_TUPLE_FREQ_SECS.
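As a rough sketch of that suggestion: a bolt can ask for its own tick tuples by returning a per-component configuration map from getComponentConfiguration(). The helper below builds such a map in plain Java so it compiles without Storm on the classpath. The class name TickConfigSketch and the 10-second interval are assumptions for illustration; the key string is the value of Storm's Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS constant.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; in a real bolt this body would go inside the bolt's
// getComponentConfiguration() override.
class TickConfigSketch {
    // String value of Storm's Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS constant.
    static final String TOPOLOGY_TICK_TUPLE_FREQ_SECS = "topology.tick.tuple.freq.secs";

    // Builds the per-component config asking Storm for a tick tuple every tickSeconds.
    static Map<String, Object> componentConfiguration(int tickSeconds) {
        Map<String, Object> conf = new HashMap<>();
        conf.put(TOPOLOGY_TICK_TUPLE_FREQ_SECS, tickSeconds);
        return conf;
    }

    public static void main(String[] args) {
        // Assumed interval of 10 seconds, purely for illustration.
        System.out.println(componentConfiguration(10));
    }
}
```

With this in AggregateBolt, the existing TupleUtils.isTick(input) check can be used there directly, so the "Tick" message no longer has to travel from TimingBolt at all.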