What guarantees does Erlang's "monitor" give?

Question

While reading the ERTS user's guide, I found this section:

The only signal ordering guarantee given is the following. If an entity sends multiple signals to the same destination entity, the order will be preserved. That is, if A sends a signal S1 to B, and later sends the signal S2 to B, S1 is guaranteed not to arrive after S2.

I've also happened across this while doing further ~~research~~ googling:

Erlang Reference Manual, 13.5:

Message sending is asynchronous and safe, the message is guaranteed to eventually reach the recipient, provided that the recipient exists.

That seems very vague and I'd like to know what guarantees I can rely on in the following scenario:

 A,B are processes on two different nodes.
 Assume A does not crash and B was a valid node at some point.
 A and B monitor each other.
 A sends messages M1,M2,M3 to B

In the above scenario, is it possible that B receives M1,M3 (M2 is dropped), without any sort of 'DOWN'/'EXIT'/heartbeat timeout being received at A?

Really depends on what A and B represent. In the case of a local process, that's not possible: all messages will be received by B (though B may crash before processing M2 or M3). In the case of a remote process, it's highly unlikely, but yes, that would imply a disconnection between the nodes. — Soup d'Campbells
@Soupd'Campbells Do you have a link to documentation that would back that up? Especially, is this 'guaranteed' rather than just something that results from the way it's implemented? — Alexander

I GIVE CRAP ANSWERS · Accepted Answer · 2014-06-20T10:23:02

There are no guarantees other than the ordering guarantee. Note that by default you don't even know who the sender is, unless the sender encodes this in the message.

Your example could happen:

A sends M1 and M2
B receives M1
The node on which B resides gets disconnected
The node on which B resides comes up again
A sends M3 to B
B receives M3

M2 can be lost on the network link in this scenario. This is highly unlikely, but it can happen. The usual trick is to build in some way of detecting such errors: either a timeout, or a monitor on the node or Pid that is the recipient of the message.
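As a rough illustration of combining both tricks, here is a minimal sketch that monitors the recipient and also waits for a reply with a timeout. The module name `safe_send`, the function `call/3`, and the `{From, Ref, Msg}` / `{Ref, ack}` message shapes are all assumptions made up for this example; the recipient would have to implement the acknowledgement itself.

```erlang
-module(safe_send).
-export([call/3]).

%% Send Msg to Pid and wait for an acknowledgement.
%% Hypothetical protocol: the recipient replies with {Ref, ack}.
call(Pid, Msg, Timeout) ->
    %% The monitor reference doubles as a unique tag for this request.
    Ref = erlang:monitor(process, Pid),
    Pid ! {self(), Ref, Msg},
    receive
        {Ref, ack} ->
            %% flush removes a 'DOWN' message that may already be queued.
            erlang:demonitor(Ref, [flush]),
            ok;
        {'DOWN', Ref, process, _Pid, Reason} ->
            %% Reason is noproc if Pid was already dead, and
            %% noconnection if the link to its node was lost.
            {error, {down, Reason}}
    after Timeout ->
        erlang:demonitor(Ref, [flush]),
        %% A late {Ref, ack} may still arrive; the caller should ignore it.
        {error, timeout}
    end.
```

Note that a timeout or a 'DOWN' only tells the sender that something went wrong. It does not say whether the message was processed before the failure.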

Updated scenario:

In the updated scenario, provided I read it correctly, A would get a 'DOWN'-style message at some point. Likewise, if it monitors the node, it would get a message telling it that the node is up again.
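For the node-level notifications, a minimal sketch could subscribe with `net_kernel:monitor_nodes/1`, which delivers `{nodeup, Node}` and `{nodedown, Node}` messages to the calling process. The module name `node_watch` and the logging are assumptions for illustration only.

```erlang
-module(node_watch).
-export([start/0]).

%% Spawn a process that logs node connectivity changes.
start() ->
    spawn(fun() ->
        %% Subscribe this process to nodeup/nodedown messages.
        %% The return value is ignored here to keep the sketch short.
        _ = net_kernel:monitor_nodes(true),
        loop()
    end).

loop() ->
    receive
        {nodedown, Node} ->
            io:format("node ~p went down~n", [Node]),
            loop();
        {nodeup, Node} ->
            io:format("node ~p is up again~n", [Node]),
            loop()
    end.
```

A `nodeup` after a `nodedown` is a hint that messages sent in between may have been lost, so it can be a reasonable point to resynchronize state.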

Often, though, such failures are better handled by making the protocol idempotent, if at all possible.
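One way to picture an idempotent protocol is a receiver that remembers which request IDs it has already applied, so the sender can safely resend after a timeout or a 'DOWN'. This is only a sketch: the module name `idem_recv`, the `{From, Id, Payload}` request shape, and the `{ack, Id, Status}` reply are invented for the example.

```erlang
-module(idem_recv).
-export([start/0]).

%% Start a receiver with an empty set of already-seen request IDs.
start() ->
    spawn(fun() -> loop(sets:new()) end).

loop(Seen) ->
    receive
        {From, Id, Payload} ->
            case sets:is_element(Id, Seen) of
                true ->
                    %% Resent request: acknowledge, but skip the side effect.
                    From ! {ack, Id, duplicate},
                    loop(Seen);
                false ->
                    apply_once(Payload),
                    From ! {ack, Id, applied},
                    %% The set grows without bound in this sketch.
                    loop(sets:add_element(Id, Seen))
            end
    end.

%% Placeholder for the real side effect.
apply_once(Payload) ->
    io:format("applying ~p~n", [Payload]).
```

With this in place the sender can simply retry M2 under the same ID until it sees an acknowledgement, and a duplicate delivery does no harm.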
