If you don't know in advance whether a message is coming, one way to handle it is to have the master process act as a message queue: it sits in an endless receive loop until every task has sent an exit signal.
This way, you don't have to worry about mismatched send/receive counts between neighboring tasks, and each task can independently poll the master from time to time to see whether there is work and/or a message for it.
This can get a bit complicated to manage, especially the message queue itself and making sure that, when a message is waiting, the two tasks involved then start a short send/receive session. But it would avoid a lot of broadcast / all-to-all traffic, which is notoriously expensive time-wise.
If the master task is largely only tracking message status, it should put very little load on the interconnect.
Basically for Task 0:
while (1) {
    MPI_Recv(args);  // block until some task checks in
    if (all tasks have sent an exit signal) { break; }
    if (any task has a message to send) { add it to the master's queue; }
    if (any task is waiting to contact the task that is now polling for messages) { tell current task to initiate a Recv wait and signal the master that it is waiting; }
    if (any other task is waiting for the current task to send to it) { tell current task to initiate a send to that task; }
}
Each task:
    if (work needs to be sent to a neighbor) { contact master; master adds to queue; }
    if (a neighbor wants to send work) { enter receive loop; }
    if (a neighbor is ready to receive work according to the master) { send work to it; }
Or, as a concrete example:
Task 3 has work for Task 8.
Task 3 contacts the master and says so.
Task 3 continues its business.
Task 8 contacts the master and sees that it has work from Task 3 pending.
Task 8 enters a receive.
Task 3 (at some polling interval) contacts the master again to check on work, and sees that a task is waiting for its work.
Task 3 initiates a send.
Task 3 then initiates a send to any other tasks still waiting on its work.
Program continues.
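To make the handshake above a bit more concrete, here is a minimal sketch of just the bookkeeping the master would need. It deliberately contains no MPI calls, so it compiles with any C compiler. In a real program each function would be called in response to a message the master picked up with MPI_Recv from MPI_ANY_SOURCE. All of the names here (MasterQueue, the mq_* functions, NTASKS) are made up for illustration, and the three-state handshake is only one assumed way of encoding the steps.

```c
#include <assert.h>
#include <string.h>

#define NTASKS 16  /* hypothetical number of worker tasks */

/* Handshake state for one (sender, receiver) pair, as seen by the master. */
typedef enum {
    IDLE = 0,        /* nothing pending between this pair */
    ANNOUNCED,       /* sender has told the master it has work for receiver */
    RECEIVER_READY   /* receiver was told about it and is sitting in a receive */
} PairState;

typedef struct {
    PairState state[NTASKS][NTASKS];  /* indexed as state[sender][receiver] */
} MasterQueue;

/* Start with every pair IDLE (IDLE is 0, so zeroing the table is enough). */
void mq_init(MasterQueue *q) {
    memset(q, 0, sizeof *q);
}

/* Task `src` reports that it has work for task `dst`. */
void mq_announce(MasterQueue *q, int src, int dst) {
    assert(src >= 0 && src < NTASKS && dst >= 0 && dst < NTASKS);
    if (q->state[src][dst] == IDLE) {
        q->state[src][dst] = ANNOUNCED;
    }
}

/* Task `dst` polls: does anyone have work for me?
 * Returns the sender's rank and records that `dst` will now wait in a
 * receive for it, or returns -1 if nothing is pending. */
int mq_poll_incoming(MasterQueue *q, int dst) {
    assert(dst >= 0 && dst < NTASKS);
    for (int src = 0; src < NTASKS; src++) {
        if (q->state[src][dst] == ANNOUNCED) {
            q->state[src][dst] = RECEIVER_READY;
            return src;
        }
    }
    return -1;
}

/* Task `src` polls: is any receiver already waiting on my work?
 * Returns that receiver's rank and clears the entry, or returns -1. */
int mq_poll_ready_receiver(MasterQueue *q, int src) {
    assert(src >= 0 && src < NTASKS);
    for (int dst = 0; dst < NTASKS; dst++) {
        if (q->state[src][dst] == RECEIVER_READY) {
            q->state[src][dst] = IDLE;
            return dst;
        }
    }
    return -1;
}
```

In the Task 3 / Task 8 example, mq_announce(&q, 3, 8) is the first contact, mq_poll_incoming(&q, 8) is Task 8 checking in and learning it should receive from 3, and Task 3 would keep calling mq_poll_ready_receiver(&q, 3) at its polling interval until it stops getting ranks back, sending to each one as it goes.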
Now, this has plenty of its own caveats. Whether or not it is efficient depends on how frequently messages are passed between tasks, and how long a given task sits waiting for its neighbor to send.
And in the end, this may not be better than a simple MPI_Alltoall(). It's entirely possible that this solution is downright terrible.