I'm investigating whether there is a framework or library that will help me implement a distributed computing system.
I have a master that has a large amount of data split up into files of a few hundred megabytes. The files would be chunked up into ~1MB pieces and distributed to workers for processing. Once initialized, the processing on each worker is dependent on state information obtained from the previous chunk, so workers must stay alive throughout the entire process, and the master needs to be able to send the right chunks to the right workers. One other thing to note is that this system is only a piece of a larger processing chain.
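To make the chunking and routing concrete, here is a rough Python sketch of what I have in mind on the master side. The routing rule (pinning each file to one worker so its chunks arrive there in order) and the names `iter_chunks`, `worker_for_file` and `plan` are just my assumptions for illustration, not part of any framework.

```python
import io

CHUNK_SIZE = 1024 * 1024  # ~1 MB pieces, as described above


def iter_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield successive chunks of at most chunk_size bytes from a binary stream."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


def worker_for_file(file_index, num_workers):
    """Assumed routing rule: pin each file to one worker rank (1..num_workers)."""
    return 1 + file_index % num_workers


def plan(streams, num_workers, chunk_size=CHUNK_SIZE):
    """Yield (worker_rank, file_index, chunk_index, chunk) in the order to send."""
    for file_index, stream in enumerate(streams):
        rank = worker_for_file(file_index, num_workers)
        for chunk_index, chunk in enumerate(iter_chunks(stream, chunk_size)):
            yield rank, file_index, chunk_index, chunk


if __name__ == "__main__":
    # Tiny in-memory "files" and a 4-byte chunk size, just to show the routing.
    files = [io.BytesIO(b"a" * 10), io.BytesIO(b"b" * 5)]
    for rank, f, c, chunk in plan(files, num_workers=2, chunk_size=4):
        print(rank, f, c, len(chunk))
```

Because every chunk of a given file goes to the same worker in order, the worker can carry state from one chunk to the next without the master having to track it.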
I did a little bit of looking into MPI (specifically Open MPI), but I'm not sure if it is the right fit. It seems to be geared to sending small messages (a few bytes), though I did find some charts that show its throughput increases with larger files (up to 1/5 MB).
I'm concerned that there might not be a way to maintain the state unless it was constantly sent back and forth in messages. Looking at the structure of some MPI examples, it looked like the master (rank 0) and the workers (ranks 1-n) were part of the same piece of code, and their actions were determined by conditionals. Can I have the workers stay alive (maintaining state) and wait for more messages to arrive?
Now that I'm writing this I'm thinking it would work. The rank 1...n section would just be a loop with a blocking receive followed by the processing code. The state would be maintained in that loop until a "no more data" message was received at which point it would send back the results. I might be beginning to grasp the MPI structure here...
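Here is a minimal sketch of that loop, written with Python threads and queues as a stand-in rather than real MPI. In an actual MPI program I assume `inbox.get()` would be a blocking `MPI_Recv` and `outbox.put()` an `MPI_Send` back to rank 0. The running byte count is only a placeholder for my real per-worker state, and `NO_MORE_DATA`, `worker` and `run_master` are names I made up.

```python
import queue
import threading

NO_MORE_DATA = None  # sentinel standing in for the "no more data" message


def worker(rank, inbox, outbox):
    """Stay alive, keep state across chunks, and report once at the end."""
    state = 0  # placeholder state: total bytes seen so far
    while True:
        chunk = inbox.get()  # blocks until the master sends something
        if chunk is NO_MORE_DATA:
            break
        state += len(chunk)  # processing depends on the state from earlier chunks
    outbox.put((rank, state))


def run_master(chunks_by_rank):
    """Start one worker per rank, feed it its chunks in order, collect results."""
    outbox = queue.Queue()
    inboxes = {rank: queue.Queue() for rank in chunks_by_rank}
    threads = [
        threading.Thread(target=worker, args=(rank, inboxes[rank], outbox))
        for rank in chunks_by_rank
    ]
    for t in threads:
        t.start()
    for rank, chunks in chunks_by_rank.items():
        for chunk in chunks:
            inboxes[rank].put(chunk)
        inboxes[rank].put(NO_MORE_DATA)
    results = {}
    for _ in chunks_by_rank:  # exactly one result per worker
        rank, state = outbox.get()
        results[rank] = state
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_master({1: [b"abc", b"de"], 2: [b"xyz"]}))
```

The point is that the state lives in a local variable inside the worker's loop, so nothing has to be shipped back and forth between chunks.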
My other question about MPI is how to actually run the code. Remember that this system is part of a larger system, so it needs to be called from some other code. The examples I've seen make use of mpirun, with which you can specify the number of processors or a hosts file. Can I get the same behavior by calling my MPI function from other code?
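The only fallback I can think of is to have the calling code shell out to mpirun as a child process, roughly like the sketch below. It uses Open MPI's `-np` and `--hostfile` options; the program path `./chunk_processor`, the `hosts.txt` file and the helper names are hypothetical. I'd prefer something more direct than this if it exists.

```python
import subprocess


def build_mpirun_command(program, num_procs, hostfile=None, program_args=()):
    """Build the argv list for launching an MPI program through mpirun."""
    cmd = ["mpirun", "-np", str(num_procs)]
    if hostfile is not None:
        cmd += ["--hostfile", hostfile]
    cmd.append(program)
    cmd.extend(program_args)
    return cmd


def launch(program, num_procs, hostfile=None, program_args=()):
    """Run the MPI job and wait for it; raises CalledProcessError on failure."""
    cmd = build_mpirun_command(program, num_procs, hostfile, program_args)
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical program and hosts file; only prints the command here.
    print(build_mpirun_command("./chunk_processor", 5, hostfile="hosts.txt"))
```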
So my question is: is MPI the right framework here? Is there something better suited to this task, or am I going to be doing this from scratch?