0
votes

We have an use case where we do not want to run storm topology continuously. Instead, there are set of inputs( 10K+) that should be processed at the specified time, Spout continuously emits these inputs and get processed by rest of the bolts in the topology. Once all the inputs are processed, there is nothing to emit from nextTuple in my spout.

At this time we wanted our topology to go to sleep and restart the process everyday night 12:00 am.

Is there any property to set in the storm config to run the topology once a day and sleep after processing is done and start at the specified time?

1

1 Answers

0
votes

I'm not aware of a feature like what you're asking for. Storm isn't a batch processing system, it's meant to be running continuously. Consider if Storm is a great fit for this use case.

That said, you should be able to implement what you want. You could put in an "I'm done" message at the end of your spout input. When the spout hits that message and all other pending messages are acked, it could use the Nimbus client to kill or deactivate the topology (depending on whether you want to kill or deactivate), see https://stackoverflow.com/a/37134473/8845188. Then the final step would be using your favorite scheduling software to resubmit/reactivate the topology every day at midnight.