I understand many of the fine details of NAT hole punching, ICE, and SIP VOIP calls. I've answered quite a few questions on SO on these topics. Now I have a question.
I am trying to understand the need for the RE-INVITE message that is documented for SIP+ICE after the call is already established.
Assume a topology of VOIP devices that signal over SIP and using ICE (with STUN/TURN) for establishing media connectivity. After the ICE connectivity checks are performed, both endpoints should have ascertained the best address candidate pairings (IP,port) and should be ready to stream media in both directions.
But my experience with SIP and plenty of documentation suggests that after the callee sends a 200 OK message to indicate he's in the answered state, the caller is to expected send a RE-INVITE with an SDP containing the specific address candidate selected by the connectivity checks.
Some links that describe the RE-INVITE with ICE are here and here (step 8). Rosenberg's tutorial (page 30) discusses that the RE-INVITE "ensures that middleboxes have the correct media address". I'm not sure why that's important.
Upon receiving a RE-INVITE, is the callee expected to reconfigure his ICE stack to switch sockets or addresses based on the new SDP received? Or is the RE-INVITE just a protocol formality to formally acknowledge the call has been established? If the RE-INVITE step was skipped and both sides started streaming media, what could go wrong?
The reason why I ask is because I am exploring using ICE over a signaling service that is not SIP. I'm trying to figure out if the RE-INVITE needs to be emulated.