3
votes

I have a single source and a set of audience members listening to the audio stream. If I stream using P2P WebRTC, the naive approach would be to create N-1 connections from the speaker, which is okay for N < 3 but otherwise makes P2P expensive in transmission. I have outlined two approaches and am trying to find out which one is best suited.

  1. Have a centralised server relaying and recording the audio. (-latency +cost)
  2. Instead of opening N-1 connections from the source, have some of the terminal nodes act as non-terminal nodes, each opening K < N-1 connections to relay and record the transmission. (+latency -cost)

I am very new to WebRTC. I am planning to build my HTTP side in C++. If I take approach (2), I add no extra server-side cost for audio streaming, but it is not straightforward. I certainly don't want to reinvent the wheel if one already exists and spins well, but I don't know what is already available or what the risks of this approach are.

If I take approach (1), what relaying server should I use? It should integrate tightly with the business logic, and this is the part I am having a hard time figuring out. With WebSockets I find this easy because everything is in the same session and all the contextual information is accessible. But here I somehow need to map user accounts to the streams and apply business logic to them, for example lowering the volume for certain users.
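To make the volume requirement concrete, this is roughly what I imagine on the receiving side, assuming I can learn during signaling which account a stream belongs to (browser TypeScript; `volumeForUser` is just a stand-in for whatever my business logic decides):

    // Placeholder: my business logic would decide the per-user gain here.
    const volumeForUser = (userId: string): number =>
      userId === "demo-quiet-user" ? 0.3 : 1.0;

    // Route a remote MediaStream through a GainNode keyed by the account id
    // that was attached to the stream during signaling.
    function attachWithUserVolume(remoteStream: MediaStream, userId: string) {
      const ctx = new AudioContext();
      const source = ctx.createMediaStreamSource(remoteStream);
      const gain = ctx.createGain();
      gain.gain.value = volumeForUser(userId);
      source.connect(gain);
      gain.connect(ctx.destination);
    }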

I also need to broadcast data in the same stream.

I can't let anyone who doesn't use my application use my TURN servers. I need some kind of token/auth system for that. How can I do that?


3 Answers

2
votes

You are right, do not reinvent the wheel. Building a WebRTC media server from scratch is hard. Luckily there are a bunch of open-source options out there to simplify the task (Kurento and Janus, for example); take a look at those first.

You will also have to decide whether to forward all the streams to every participant or to mix them on the server and send a single stream to each participant in the conference. These are known as the SFU and MCU approaches, respectively.

Regarding the question about how to send data over the same stream, WebRTC provides a mechanism called the DataChannel, intended for exactly that. Thanks to it, a single peer connection can carry different kinds of content: audio, video, screen sharing or data.
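A minimal sketch of what that looks like on the broadcaster's side (TypeScript; the offer/answer exchange over your own signaling is omitted):

    async function startBroadcast() {
      const micStream = await navigator.mediaDevices.getUserMedia({ audio: true });
      const pc = new RTCPeerConnection({ iceServers: [] }); // add your STUN/TURN servers here

      // Audio goes out as regular tracks...
      micStream.getTracks().forEach(track => pc.addTrack(track, micStream));

      // ...and arbitrary application data rides the same connection via a DataChannel.
      const channel = pc.createDataChannel("broadcast-data");
      channel.onopen = () => channel.send(JSON.stringify({ type: "hello" }));

      const offer = await pc.createOffer();
      await pc.setLocalDescription(offer);
      // send `offer` over your signaling; listeners receive the data via pc.ondatachannel
      return { pc, channel };
    }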

1
vote

Instead of opening N-1 connections from the source, have some of the terminal nodes act as non-terminal nodes, each opening K < N-1 connections to relay and record the transmission. (+latency -cost)

The broadcast demo of Muaz Khan's RTCMultiConnection demonstrates this, so you don't need to reinvent that wheel. As you say, it costs less than a central server but increases latency, and it depends on the use case whether that trade-off is better than a central server.
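If you want a feel for what a relaying participant does under the hood (independent of RTCMultiConnection), the idea is simply to take the upstream MediaStream you received and add its tracks to the peer connections you open towards the listeners you serve. A rough sketch, with the signaling left out:

    // Called on a non-terminal node once it has received the broadcaster's stream.
    function relayTo(listenerIds: string[], upstream: MediaStream): RTCPeerConnection[] {
      return listenerIds.map(id => {
        const pc = new RTCPeerConnection({ iceServers: [] }); // add your STUN/TURN servers
        // Re-send the audio received from upstream to this downstream listener.
        upstream.getAudioTracks().forEach(track => pc.addTrack(track, upstream));
        // createOffer/setLocalDescription + your signaling for `id` goes here.
        return pc;
      });
    }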

0
votes

I can't let anyone who doesn't use my application use my TURN servers. I need some kind of token/auth system for that. How can I do that?

The standard way of doing this is to use time-limited credentials. The coturn server supports this out of the box (the "TURN REST API" scheme based on a shared secret) and has a very good explanation in its wiki.

It's not perfect, but so far there have been no major cases of abuse.
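In that scheme your application server hands out short-lived credentials derived from the secret shared with coturn (the one configured via static-auth-secret/use-auth-secret): the username is an expiry timestamp, optionally followed by a user id, and the password is a base64 HMAC-SHA1 of that username. A sketch in TypeScript/Node; the secret value and TTL are placeholders:

    import { createHmac } from "node:crypto";

    // Must match the static-auth-secret configured in coturn.
    const TURN_SECRET = "replace-with-your-shared-secret";

    function turnCredentials(userId: string, ttlSeconds = 3600) {
      const expiry = Math.floor(Date.now() / 1000) + ttlSeconds;
      const username = `${expiry}:${userId}`;             // coturn reads the leading timestamp
      const credential = createHmac("sha1", TURN_SECRET)   // HMAC-SHA1, base64-encoded
        .update(username)
        .digest("base64");
      return { username, credential };
    }

The client then puts these values into the iceServers entry for your TURN URL, and coturn stops accepting the credential once the timestamp has passed.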