2
votes

While reading the ERTS user's guide, I found this section:

The only signal ordering guarantee given is the following. If an entity sends multiple signals to the same destination entity, the order will be preserved. That is, if A sends a signal S1 to B, and later sends the signal S2 to B, S1 is guaranteed not to arrive after S2.

I also came across this while researching further:

Erlang Reference Manual, 13.5:

Message sending is asynchronous and safe, the message is guaranteed to eventually reach the recipient, provided that the recipient exists.

That seems very vague and I'd like to know what guarantees I can rely on in the following scenario:

 A,B are processes on two different nodes.
 Assume A does not crash and B was a valid node at some point.
 A and B monitor each other.
 A sends messages M1,M2,M3 to B

In the above scenario, is it possible that B receives M1,M3 (M2 is dropped), without any sort of 'DOWN'/'EXIT'/heartbeat timeout being received at A?

2
Really depends on what A and B represent. In the case of a local process, that's not possible: all messages will be received by B (though B may crash before processing M2 or M3). In the case of a remote process, it's highly unlikely, but yes, it can happen; that would imply a disconnection between the nodes. - Soup d'Campbells
@Soupd'Campbells Do you have a link to documentation that would back that up? In particular, is this 'guaranteed' rather than just something that results from the way it's implemented? - Alexander
I've updated the question with a more specific scenario. - Alexander

2 Answers

3
votes

There are no guarantees other than the ordering guarantee. Note that by default you don't even know who the sender is, unless the sender encodes this in the message.

Your example could happen:

  • A sends M1 and M2
  • B receives M1
  • The node on which B resides gets disconnected
  • The node on which B resides comes up again
  • A sends M3 to B
  • B receives M3

M2 can be lost on the network link in this scenario. It is highly unlikely, but it can happen. The usual trick is to build in some way of detecting such errors, either with a timeout or by monitoring the node or Pid that is the recipient of the message.
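As a rough illustration of the timeout and monitoring approaches mentioned above, here is a minimal sketch of a request that detects failure either way. The module name `monitored_send`, the `{request, ...}` / `{reply, ...}` tuples, and `echo_server/0` are made up for this example; only `erlang:monitor/2` and `erlang:demonitor/2` are standard calls.

```erlang
-module(monitored_send).
-export([call/3, echo_server/0]).

%% Send Request to Pid and wait for a reply. Failure is detected either
%% through the monitor ('DOWN') or through the timeout.
%% The message shapes below are a hypothetical protocol, not an OTP one.
call(Pid, Request, Timeout) ->
    Ref = erlang:monitor(process, Pid),
    Pid ! {request, self(), Ref, Request},
    receive
        {reply, Ref, Reply} ->
            %% Drop the monitor and flush a 'DOWN' that may already be queued.
            erlang:demonitor(Ref, [flush]),
            {ok, Reply};
        {'DOWN', Ref, process, Pid, Reason} ->
            %% Recipient died, never existed, or its node became unreachable.
            {error, {down, Reason}}
    after Timeout ->
        erlang:demonitor(Ref, [flush]),
        {error, timeout}
    end.

%% Trivial recipient used only to try out call/3.
echo_server() ->
    receive
        {request, From, Ref, Request} ->
            From ! {reply, Ref, Request},
            echo_server()
    end.
```

For example, `monitored_send:call(spawn(monitored_send, echo_server, []), hello, 1000)` should return `{ok, hello}`. Note that after a timeout a late reply may still arrive in the caller's mailbox, so a real protocol would need to discard it.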

Updated scenario:

In the updated scenario, provided I read it correctly, A would get a 'DOWN' message at some point. Likewise, if A monitors the node, it would get a message saying that the node is up again.
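Node-level monitoring of this kind could look roughly like the following sketch. It assumes `net_kernel:monitor_nodes/1`, which subscribes the calling process to `{nodeup, Node}` and `{nodedown, Node}` messages; the module name `node_watch` and the `stop` message are invented for the example.

```erlang
-module(node_watch).
-export([start/0]).

%% Spawn a process that subscribes to node status changes and logs them.
start() ->
    spawn(fun() ->
        net_kernel:monitor_nodes(true),
        loop()
    end).

loop() ->
    receive
        {nodeup, Node} ->
            io:format("node ~p is up~n", [Node]),
            loop();
        {nodedown, Node} ->
            %% Messages sent to processes on Node around this time may be lost.
            io:format("node ~p went down~n", [Node]),
            loop();
        stop ->
            ok
    end.
```

A `nodedown` followed by a `nodeup` for B's node is the signal that anything sent in between (such as M2) may not have arrived.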

Often, though, such things are better modeled using an idempotent protocol, if at all possible.
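One possible shape for such an idempotent protocol is sketched below: every message carries a sequence number, the receiver applies each number at most once and always acknowledges, and the sender simply re-sends until it sees the ack. The module name `idem`, the message tuples, the retry count of 3, and the 1000 ms timeout are all assumptions made for illustration.

```erlang
-module(idem).
-export([start/0, deliver/3, applied/1]).

%% Receiver: remembers which sequence numbers it has already applied.
start() ->
    spawn(fun() -> loop(sets:new(), []) end).

loop(Seen, Applied) ->
    receive
        {msg, From, Seq, Payload} ->
            %% Always ack, so a sender whose ack was lost can stop retrying.
            From ! {ack, Seq},
            case sets:is_element(Seq, Seen) of
                true ->
                    %% Duplicate: already applied, ignore the payload.
                    loop(Seen, Applied);
                false ->
                    loop(sets:add_element(Seq, Seen), [Payload | Applied])
            end;
        {get_applied, From} ->
            From ! {applied, lists:reverse(Applied)},
            loop(Seen, Applied)
    end.

%% Sender: re-send until acknowledged, giving up after a few attempts.
deliver(Pid, Seq, Payload) ->
    deliver(Pid, Seq, Payload, 3).

deliver(_Pid, _Seq, _Payload, 0) ->
    {error, no_ack};
deliver(Pid, Seq, Payload, Retries) ->
    Pid ! {msg, self(), Seq, Payload},
    receive
        {ack, Seq} -> ok
    after 1000 ->
        deliver(Pid, Seq, Payload, Retries - 1)
    end.

%% Ask the receiver which payloads it has applied, in order of application.
applied(Pid) ->
    Pid ! {get_applied, self()},
    receive
        {applied, L} -> L
    after 1000 ->
        timeout
    end.
```

Because re-sending a sequence number has no additional effect, the sender does not need to know whether the original message or only its ack was lost.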

1
votes

Reading through the Erlang mailing list and the academic FAQ, it seems that the ERTS implementation provides a few guarantees. However, I was not able to determine whether they are also guaranteed at the language/specification level.

If you assume TCP is "reliable", then the current implementation guarantees the following: given that A and B are processes on different nodes (and hosts), A monitors B, A sends to B, and A doesn't crash, any message delivery failure* between the two nodes, or any host/node/process failure on B's side, will lead to A getting a 'DOWN' message (or 'EXIT' in the case of links). [ See 1 and 2 ]

*From what I have read on the mailing-list thread, this property is almost entirely based on the fact that TCP is used, so "message delivery failure" means any situation where TCP decides that a failure has occurred or that the connection needs to be closed.

The academic FAQ talks about this as if it were also a language/specification-level guarantee, but I haven't yet found anything to back that up.