I'd like to stream a user's webcam (from the browser) to a server and I need the server to be able to manipulate the stream (run some C algorithms on that video stream) and send the user back information.

I have looked closely at WebRTC and MediaCapture, and read the examples here: https://bitbucket.org/webrtc/codelab/overview .

However, these are designed for peer-to-peer video chat. From what I understand, the MediaStream from getUserMedia is transmitted over an RTCPeerConnection (with addStream); what I'd like to know is: can I use this, but process the video stream on the server?

Thanks in advance for your help.

Yes, you can send the stream to a server and manipulate it there :). What specific questions do you have about it? There are numerous MCU servers out there (check out licode). – Benjamin Trent
You would not use the browser API; you should use the native C/C++ WebRTC API. A browser can then make a call to the app you build with the native API, and you can manipulate the stream from there. – Benjamin Trent
Okay. I will go check the code and, when I'm successful, I'll come back here and post an answer for others. In the meantime, if somebody comes here and knows something about this whole thing, they are very welcome! (Particularly if they can tell us whether implementing signaling is necessary or not :-) ) – nschoe
@nschoe Although I haven't used the native API, signalling still seems crucial for setting up an RTC connection to your server. The SDP (Session Description Protocol) describes who you are, where you are, and what media (and codec) you are going to use ("you" being the browser). ICE candidates are also important to establish the connection. I suggest you read something about setting up a WebRTC connection; this has a lot of information about signalling. The system should work roughly the same way in C. – MarijnS95
... There are event handlers that trigger whenever the connection state changes: pc.onsignalingstatechange fires when the signalling has been done right, and pc.signalingState contains the current status. The same goes for the ICE engine: pc.oniceconnectionstatechange, plus pc.iceGatheringState and pc.iceConnectionState. You can find all of this in the W3C spec. – MarijnS95

1 Answer


Here is the solution I have designed. I post it here for people seeking the same kind of information :-)

Front-end side

I use the WebRTC API: get the webcam stream with getUserMedia, and open an RTCPeerConnection (plus an RTCDataChannel for downstream information). The stream is DTLS encrypted (mandatory), and the multimedia streams use RTP and RTCP. The video is VP8-encoded and the audio Opus-encoded.

Back-end side

The backend is the complex part. The best alternative I have found (yet) is the Janus Gateway. It takes care of a lot of stuff, like the DTLS handshake, RTP/RTCP demuxing, etc. Basically, it fires an event each time an RTP packet is transmitted. (RTP packets are typically the size of the MTU, so there is not a 1:1 mapping between video frames and RTP packets.)

I then built a GStreamer (version 1.0) pipeline to depacketize the RTP packets, decode the VP8, handle video scaling and colorspace/format conversion, and output a BGR matrix (compatible with OpenCV). There is an AppSrc element at the beginning of the pipeline and an AppSink at the end.
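For reference, a decode chain of this shape can be prototyped on the command line with gst-launch-1.0 before wiring it up with appsrc/appsink in C. The UDP port and RTP caps below are assumptions for testing (in the real setup, Janus hands you the packets programmatically), and you need the base, good, and vpx GStreamer plugin sets installed:

```shell
# Illustrative prototype: receive VP8 RTP on UDP port 5004 (port and caps
# are assumptions), depacketize, decode, scale, and convert to BGR for
# OpenCV. In the actual program, appsrc replaces udpsrc and appsink
# replaces fakesink.
gst-launch-1.0 -v \
  udpsrc port=5004 \
    caps="application/x-rtp,media=video,encoding-name=VP8,payload=96,clock-rate=90000" \
  ! rtpvp8depay ! vp8dec \
  ! videoscale ! videoconvert \
  ! "video/x-raw,format=BGR" \
  ! fakesink
```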

What's left to do

I still have to take extra measures to ensure good scalability (threads, memory leaks, etc.) and to find a clean and efficient way of using the C++ library I have inside this program.

Hope this helps!