I'm implementing a standard MPI master/slave system: there is a master that distributes work, and there are slaves who ask for chunks and process data.

However, in the naive implementation (rank 0 is the master, all other ranks are slaves), the master occupies a whole core while doing practically no computation. So I tried running the scheduler as a separate thread inside rank 0, but that meant the process had to send MPI messages to itself, and I couldn't get it to work.

Do you have any ideas how to solve this?

Think about whether a scheduler thread is necessary. I'm assuming you are running the application on a single machine. If you use a separate thread in rank 0, you have 2 threads sharing a core. If you just start one more rank in the job, you also have 2 threads sharing a core, but with much simpler code :) - Greg Inozemtsev
Indeed, that would work nicely on one machine, and even on several, as long as rank 0 keeps getting started on the box I launch the job from (which has been the case so far). Where I'd like to end up running it, though, is a cluster with a scheduler that will allocate a core for each thread, and I can't really influence that. - Latanius

1 Answer

As I realized after some googling: you can send messages to yourself using tags. Tags act as a kind of filter: a recv that asks only for tag==1 matches only messages carrying that tag, so a later message with tag 1 can overtake earlier messages that carry other tags.

So, as for the solution:

  • tag the "scheduler to worker" and "worker to scheduler" messages with two different tags
  • if rank==0: start a scheduler thread
  • afterwards, regardless of the rank, request work.

This way, the rank 0 worker won't receive its own "let's give me work" messages, because they will have a "to be received by the scheduler only" tag.
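A rough sketch of the scheme above, under loud assumptions: it does not call MPI at all. `LocalComm` is a made-up single-process stand-in that imitates only tag matching, so the logic can be tried without an MPI installation; it ignores `dest` and `source`, which is only acceptable because everything runs in one process. The names `TAG_REQUEST`, `TAG_WORK`, `scheduler`, `worker` and `run` are also invented for illustration and are not taken from my actual code.

```python
import threading

TAG_REQUEST = 1  # worker -> scheduler: "give me work" (payload: worker rank)
TAG_WORK = 2     # scheduler -> worker: a chunk, or None when nothing is left


class LocalComm:
    """Hypothetical single-process stand-in for an MPI communicator.

    It mimics only tag matching: recv(tag=t) returns the oldest queued
    message carrying tag t and leaves messages with other tags queued.
    """

    def __init__(self):
        self._box = []
        self._cv = threading.Condition()

    def send(self, obj, dest, tag):
        # dest is ignored here: there is only one process in this stand-in.
        with self._cv:
            self._box.append((tag, obj))
            self._cv.notify_all()

    def recv(self, source=None, tag=None):
        with self._cv:
            while True:
                for i, (t, obj) in enumerate(self._box):
                    if t == tag:
                        del self._box[i]
                        return obj
                self._cv.wait()  # nothing with this tag yet


def scheduler(comm, chunks, n_workers):
    """Answer work requests until every worker has been told to stop."""
    pending = list(chunks)
    stopped = 0
    while stopped < n_workers:
        requester = comm.recv(tag=TAG_REQUEST)  # sees requests only
        chunk = pending.pop(0) if pending else None
        if chunk is None:
            stopped += 1
        comm.send(chunk, dest=requester, tag=TAG_WORK)


def worker(comm, rank, process):
    """Request chunks from rank 0 until the scheduler answers with None."""
    results = []
    while True:
        comm.send(rank, dest=0, tag=TAG_REQUEST)
        chunk = comm.recv(source=0, tag=TAG_WORK)  # sees work only
        if chunk is None:
            return results
        results.append(process(chunk))


def run(comm, rank, size, chunks, process):
    """Rank 0 starts the scheduler thread; every rank then acts as a worker."""
    thread = None
    if rank == 0:
        thread = threading.Thread(target=scheduler, args=(comm, chunks, size))
        thread.start()
    results = worker(comm, rank, process)
    if thread is not None:
        thread.join()
    return results


if __name__ == "__main__":
    print(run(LocalComm(), rank=0, size=1, chunks=[1, 2, 3],
              process=lambda x: x * x))
```

The `send(obj, dest=..., tag=...)` and `recv(source=..., tag=...)` calls are shaped like the ones on mpi4py's `MPI.COMM_WORLD`, so passing that communicator together with `Get_rank()` and `Get_size()` should in principle work, but I have not verified this on a cluster. Calling MPI from two threads at once also requires the MPI library to provide `MPI.THREAD_MULTIPLE`, which can be checked with `MPI.Query_thread()`.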

Edit: this doesn't really seem to be thread-safe, though: it sometimes crashes in "free()", even though it's written in Python. So I'd still be interested in a real, proven solution :)