Recently I have been developing some timing tools for a Storm topology, but I still have some questions about sharing data in a Storm cluster:
If a component (spout/bolt) is configured with more than one executor per worker, say one worker, a parallelism_hint of 3 for the component, and the default task setting (i.e. 1 task per executor), does that mean there are 3 instances of the component in the worker? If not, should the fields of the component be accessed inside a synchronized block?
If an additional thread named "athread" is created in a component (within the prepare() or open() method), how many "athread" instances are there in the Storm cluster?
As "Understanding the Parallelism of a Storm Topology" says, a worker is a separate process, and a worker process executes a subset of a topology. Does that mean global variables of the topology (such as public static fields or other static variables) can only be shared within one worker?
If a spout's parallelism_hint is set to a value greater than 1 and there is a Utils.sleep(1000) statement in its nextTuple() method, does that mean the number of tuples the spout emits per second equals the number of its executors (threads)?
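To make the questions about instance fields and static fields concrete, here is a plain-Java model of what I have in mind. It uses no Storm classes at all: ExecutorSketch and Component are hypothetical names, and the sketch simply assumes that each executor thread gets its own component instance inside a single JVM, which is exactly the part I am unsure about.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Plain-Java model only; none of these types come from Storm.
class ExecutorSketch {

    // Hypothetical stand-in for a spout/bolt.
    static class Component {
        // Static field: one copy per JVM, so every instance in this process sees it.
        static final AtomicInteger sharedCalls = new AtomicInteger();
        // Instance field: one copy per instance, touched only by the owning thread here.
        int ownCalls = 0;

        void execute() {
            ownCalls++;                    // no lock needed if only one thread uses this instance
            sharedCalls.incrementAndGet(); // must be thread-safe, all threads hit it
        }
    }

    // Starts `executors` threads, each with its own Component, each calling execute() `calls` times.
    // Returns the per-instance counts followed by the static count in the last slot.
    static int[] run(int executors, int calls) {
        Component.sharedCalls.set(0);
        List<Component> instances = new ArrayList<>();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < executors; i++) {
            Component c = new Component();
            instances.add(c);
            Thread t = new Thread(() -> {
                for (int k = 0; k < calls; k++) {
                    c.execute();
                }
            });
            threads.add(t);
            t.start();
        }
        try {
            for (Thread t : threads) {
                t.join(); // join() makes each thread's writes visible to this thread
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        int[] result = new int[executors + 1];
        for (int i = 0; i < executors; i++) {
            result[i] = instances.get(i).ownCalls;
        }
        result[executors] = Component.sharedCalls.get();
        return result;
    }

    public static void main(String[] args) {
        int[] r = run(3, 1000);
        System.out.println("per-instance: " + r[0] + ", " + r[1] + ", " + r[2]
                + "; static: " + r[3]);
    }
}
```

In this model each instance counts 1000 calls and the static counter reaches 3000. If Storm behaves like the model, instance fields would not need synchronization, while static fields would be shared by the executors of one worker process only.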
Thanks very much.