6
votes

I know there needs to be a STUN/ICE/TURN server to find the IP addresses of the peers involved in a WebRTC communication. However, even after IPs are found, how do the peers actually talk to each other independently without having any ports opened?

If you build a website, you usually have to open the ports on your server to have others access your site. What's the magic that is happening in WebRTC that I'm not understanding?

2 Answers

8
votes

There are several strategies to do this: one possibility is for the client to explicitly open a port via UPnP. I'm not sure if any current WebRTC client does so, but in general networking this is a possibility.

Failing that, the STUN server kicks in. There are several hole punching techniques it can try; read the aforelinked article for the gory details. In short though, a firewall will usually open a port for outgoing traffic (because it needs to receive responses), so by sending outgoing traffic to a known target (the STUN server) and noting which public IP and port were opened for it, the client learns an address the other peer can often reach it on.
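
To make the "note the port that was opened" step concrete, here is a minimal sketch of the STUN exchange in Python. It assumes the RFC 5389 message layout (Binding Request, XOR-MAPPED-ADDRESS attribute, IPv4 only). The function names are made up for illustration, and `discover_public_address` expects whatever STUN server address you pass in. It is not what a browser literally runs, and it skips retransmission and transaction ID checks.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442        # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(txn_id=None):
    """Return a 20-byte STUN Binding Request with no attributes."""
    txn_id = txn_id or os.urandom(12)
    return struct.pack("!HHI12s", BINDING_REQUEST, 0, MAGIC_COOKIE, txn_id)


def parse_xor_mapped_address(response):
    """Return (ip, port) as seen by the STUN server, or None if not found."""
    if len(response) < 20:
        return None
    msg_type, msg_len, cookie = struct.unpack_from("!HHI", response, 0)
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    offset, end = 20, min(20 + msg_len, len(response))
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack_from("!HH", response, offset)
        value = response[offset + 4:offset + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack_from("!HI", value, 2)
            port = xport ^ (MAGIC_COOKIE >> 16)
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        # Attributes are padded to a multiple of 4 bytes.
        offset += 4 + attr_len + (-attr_len % 4)
    return None


def discover_public_address(sock, stun_server):
    """Ask stun_server (a (host, port) tuple) how it sees this UDP socket."""
    sock.sendto(build_binding_request(), stun_server)
    data, _ = sock.recvfrom(2048)
    return parse_xor_mapped_address(data)
```

The important detail is that the request goes out from the same UDP socket the peer will later use for media, so the mapping the NAT created for the STUN request is the one that gets advertised to the other side.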

Failing even that, a TURN server is necessary. This server is publicly accessible from both peers, even if the peers cannot reach each other directly. The TURN server then acts as a relay between the two. This somewhat negates the point of a P2P protocol, but is necessary in a certain percentage of situations (estimates are around 10% to 20%).
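
One way to picture the "failing that" ordering: ICE gives every candidate a priority, and the formula recommended in RFC 5245 ranks direct (host) candidates first, STUN-derived ones next, and TURN relays last. The sketch below assumes the RFC's recommended type preferences and a single network interface. The function names are illustrative only.

```python
# Recommended type preferences from RFC 5245, section 4.1.2.
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type, local_preference=65535, component_id=1):
    """priority = 2^24 * type_pref + 2^8 * local_pref + (256 - component_id)"""
    return (
        (TYPE_PREFERENCE[cand_type] << 24)
        + (local_preference << 8)
        + (256 - component_id)
    )


def order_candidates(cand_types):
    """Sort candidate types from most preferred to least preferred."""
    return sorted(cand_types, key=candidate_priority, reverse=True)


print(order_candidates(["relay", "host", "srflx"]))
```

So the relay is the last resort not because it is tried "later" in time, but because it gets the lowest priority when candidate pairs are checked.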

5
votes

The original question is "what/who creates the sockets?"

  • The browser creates the sockets and binds them to local ports for you during "ICE gathering".
  • Whether or not you use a STUN/TURN server, each candidate generated during ICE gathering has a corresponding open port.
  • Those ports usually stay open for only 30 minutes, after which they are revoked to prevent attacks using old and/or spoofed candidates. The 30-minute figure is not defined in any specification; it is an arbitrary choice by the browser vendor.
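
As a rough illustration of what "creating a socket and binding it to a local port" means at the OS level, here is a sketch in Python rather than inside a browser. Binding to port 0 asks the OS to pick a free ephemeral port. The loopback address and the function name are assumptions chosen to keep the example runnable anywhere.

```python
import socket


def open_candidate_socket(local_ip="127.0.0.1"):
    """Create a UDP socket and let the OS choose a free local port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((local_ip, 0))  # port 0 means "any free port"
    return sock


sock = open_candidate_socket()
ip, port = sock.getsockname()
print(f"local candidate address: {ip}:{port}")
sock.close()
```

No firewall or router configuration is involved here. The socket exists locally either way; whether anyone outside can reach it is a separate question.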

The next question is "how does the remote peer know which ports are open?"

  • Through the ICE mechanism, which generates potential candidates for each media stream and sends them to the remote peer through your preferred signaling channel.
  • ICE candidates (which are one line of SDP, really) have a "type". If the type is HOST, the candidate is a local candidate generated without any STUN or TURN server. If the type is SRFLX, a STUN server was used to add the mapping between your local IP:port and your public IP:port. If the type is RELAY, the same was done through a TURN server, and the candidate address is one allocated on that server.
  • Of course, using the local IP:port HOST candidate will fail unless the remote peer is on the same local network.
  • From the browser's and the local system's point of view, the socket is open on the local IP:port either way. Hence, opening the sockets and finding out which IP:port a remote peer should use to reach them are separate problems, handled separately.
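
To show what such a candidate line carries, here is a small parser sketch. The example line is made up (documentation IP ranges, arbitrary ports) and `parse_candidate` is an illustrative helper, not a browser or library API. Note that in the actual SDP the types are written in lowercase: `host`, `srflx`, `relay`.

```python
def parse_candidate(line):
    """Parse one ICE candidate line into a dict (basic fields only)."""
    if line.startswith("a="):
        line = line[2:]
    fields = line.split()
    if len(fields) < 8 or not fields[0].startswith("candidate:") or fields[6] != "typ":
        raise ValueError(f"not an ICE candidate line: {line!r}")
    cand = {
        "foundation": fields[0][len("candidate:"):],
        "component": int(fields[1]),
        "transport": fields[2].lower(),
        "priority": int(fields[3]),
        "address": fields[4],   # where the remote peer should send
        "port": int(fields[5]),
        "type": fields[7],      # host, srflx, prflx or relay
    }
    # Optional key/value extensions; raddr/rport give the local address
    # that a srflx or relay candidate was derived from.
    extras = fields[8:]
    for key, value in zip(extras[0::2], extras[1::2]):
        if key == "raddr":
            cand["related_address"] = value
        elif key == "rport":
            cand["related_port"] = int(value)
    return cand


example = ("candidate:1 1 udp 1694498815 203.0.113.7 54321 "
           "typ srflx raddr 192.168.1.10 rport 50000")
parsed = parse_candidate(example)
print(parsed["type"], parsed["address"], parsed["port"])
```

In this example the socket itself is bound to 192.168.1.10:50000, while 203.0.113.7:54321 is what the remote peer is told to use, which is exactly the "separate problems" point above.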

The final question is "can it really work without a STUN server?"

  • Most probably not, unless you are on the same subnetwork.
  • Stats show (http://webrtcstats.com) that even with a STUN server, connections still fail in 8% of cases for the general public. The failure rate is much higher in enterprise networks, where you'd better have an advanced TURN setup (supporting tunneling through TCP/80 and TLS/443) and even support for the HTTP proxy CONNECT method.