I am trying to get familiar with the anatomy of a SIP SDP. Here is a sample SDP from my Tandberg VC unit.

v=0
o=tandberg 1 3 IN IP4 192.168.1.94
s=-
c=IN IP4 192.168.1.94
b=AS:768
t=0 0
m=audio 47032 RTP/AVP 97 98 99 100 101 9 15 8 0 102
b=TIAS:64000
a=rtpmap:97 MP4A-LATM/90000
a=fmtp:97 profile-level-id=24;object=23;bitrate=64000
a=rtpmap:98 MP4A-LATM/90000
a=fmtp:98 profile-level-id=24;object=23;bitrate=56000
a=rtpmap:99 MP4A-LATM/90000
a=fmtp:99 profile-level-id=24;object=23;bitrate=48000
a=rtpmap:100 G7221/16000
a=fmtp:100 bitrate=32000
a=rtpmap:101 G7221/16000
a=fmtp:101 bitrate=24000
a=rtpmap:9 G722/8000
a=rtpmap:15 G728/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:102 telephone-event/8000
a=fmtp:102 0-15
m=video 47034 RTP/AVP 122 121 120 34 31
b=TIAS:768000
a=rtpmap:122 H264-RCDO/90000
a=fmtp:122 profile-level-id=008016;max-mbps=42000;max-fs=3600;max-smbps=323500
a=rtpmap:121 H264/90000
a=fmtp:121 profile-level-id=428016;max-mbps=35000;max-fs=3600;max-smbps=323500
a=rtpmap:120 H263-1998/90000
a=fmtp:120 custom=1280,720,3;custom=1024,768,4;custom=1024,576,2;custom=800,600,3;cif4=2;custom=720,480,2;custom=640,480,2;custom=512,288,1;cif=1;custom=352,240,1;qcif=1;sqcif=1;maxbr=7680
a=rtpmap:34 H263/90000
a=fmtp:34 cif4=2;cif=1;qcif=1;sqcif=1;maxbr=7680
a=rtpmap:31 H261/90000
a=fmtp:31 cif=1;qcif=1;maxbr=7680
a=rtcp-fb:* nack pli
a=content:main
a=label:11
a=answer:full
m=application 5071 UDP/BFCP *
a=floorctrl:c-s
a=confid:1
a=floorid:2 mstrm:12
a=userid:1
a=setup:passive
a=connection:new
m=video 47036 RTP/AVP 120 34 31
b=TIAS:768000
a=rtpmap:120 H263-1998/90000
a=fmtp:120 custom=1280,720,3;custom=1024,768,4;custom=1024,576,2;custom=800,600,3;cif4=2;custom=720,480,2;custom=640,480,2;custom=512,288,1;cif=1;custom=352,240,1;qcif=1;sqcif=1;maxbr=7680
a=rtpmap:34 H263/90000
a=fmtp:34 cif4=2;cif=1;qcif=1;sqcif=1;maxbr=7680
a=rtpmap:31 H261/90000
a=fmtp:31 cif=1;qcif=1;maxbr=7680
a=rtcp-fb:* nack pli
a=content:slides
a=label:12
m=application 47038 RTP/AVP 103
a=rtpmap:103 H224/4800

So my understanding is that RTP/AVP protocol can only be used with media-type audio or video. Keeping this in view I didn't understand the last two lines:

m=application 47038 RTP/AVP 103
a=rtpmap:103 H224/4800

Any ideas on what they mean?

2 Answers

2
votes

So my understanding is that RTP/AVP protocol can only be used with media-type audio or video.

There is no such restriction. RFC 4566 states:

<media> is the media type. Currently defined media are "audio", "video", "text", "application", and "message", although this list may be extended in the future (see Section 8).

Application-specific data can also be sent over RTP. In your case, the lines

m=application 47038 RTP/AVP 103
a=rtpmap:103 H224/4800

refer to RFC 4573, which registers the RTP payload format for H.224, used for far-end (remote) camera control.

1
votes

You use SDP to negotiate a session between two peers. A session may consist of multiple media lines. If we want both audio and video in the session (i.e. a video call), we need two media lines. In RFC 4566 a media line is described as:

m=<media> <port> <proto> <fmt> ...

Where <media> can be:

<media> is the media type. Currently defined media are "audio", "video", "text", "application", and "message",

So in our case we would need two media lines, one for audio and one for video. Each media line gives the port and the transport protocol (e.g. RTP over UDP) on which that medium, e.g. audio, shall be received.
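To make the field layout concrete, here is a minimal Python sketch that splits an m= line into its four parts. The names MediaLine and parse_media_line are made up for illustration and are not part of any SDP library; the sample lines are taken from the SDP in the question.

```python
from typing import List, NamedTuple


class MediaLine(NamedTuple):
    """Fields of an SDP media line: m=<media> <port> <proto> <fmt> ..."""
    media: str
    port: int
    proto: str
    fmts: List[str]


def parse_media_line(line: str) -> MediaLine:
    """Split one m= line into media type, port, transport and format list.

    Hypothetical helper for illustration only; it does no validation
    beyond checking the "m=" prefix.
    """
    if not line.startswith("m="):
        raise ValueError("not a media line: " + line)
    media, port, proto, *fmts = line[2:].split()
    # The port field may carry a port count ("49170/2"); keep the base port.
    base_port = int(port.split("/")[0])
    return MediaLine(media, base_port, proto, fmts)


audio = parse_media_line("m=audio 47032 RTP/AVP 97 98 99 100 101 9 15 8 0 102")
app = parse_media_line("m=application 47038 RTP/AVP 103")
bfcp = parse_media_line("m=application 5071 UDP/BFCP *")

for m in (audio, app, bfcp):
    print(m.media, m.port, m.proto, m.fmts)
```

Note that both "application" lines parse the same way; only the proto field tells you whether the formats are RTP payload type numbers (RTP/AVP) or something else (UDP/BFCP).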

So in your example, the sender of the SDP message wants to receive packets on port 47038, and RTP is used to carry the data. AVP stands for Audio/Video Profile (see Wikipedia). RTP has a range of predefined (static) payload type numbers, e.g. number 0 stands for PCM µ-law. In your case the number comes from the dynamic range (96-127), which exists so that the payload type map can be extended with new codecs. When a dynamic payload type is used, the codec has to be described in more detail. That's the job of the a= line (attribute line) below the media line.
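As a rough illustration of the static versus dynamic split, the following Python sketch classifies some of the payload type numbers that appear in the question's SDP. The helper names are invented for this example, and the STATIC_NAMES table lists only the static assignments that the SDP itself uses, not the full table from the RTP audio/video profile.

```python
# Dynamic payload types occupy 96..127 inclusive.
DYNAMIC_PT_RANGE = range(96, 128)

# Subset of the static assignments, limited to those seen in the question.
STATIC_NAMES = {
    0: "PCMU",
    8: "PCMA",
    9: "G722",
    15: "G728",
    31: "H261",
    34: "H263",
}


def is_dynamic_payload_type(pt: int) -> bool:
    """Return True if pt lies in the dynamic range and so needs an a=rtpmap line."""
    return pt in DYNAMIC_PT_RANGE


def describe(pt: int) -> str:
    """Hypothetical helper: label a payload type as static or dynamic."""
    if is_dynamic_payload_type(pt):
        return "dynamic, meaning comes from a=rtpmap"
    return "static, " + STATIC_NAMES.get(pt, "not in this example's table")


for pt in (0, 8, 9, 97, 103):
    print(pt, describe(pt))
```

Payload type 103 from the m=application line falls in the dynamic range, which is why the a=rtpmap:103 line is needed to say what it means.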

RFC 4566:

Attributes are the primary means for extending SDP. Attributes may be defined to be used as "session-level" attributes, "media-level" attributes, or both.

A media description may have any number of attributes ("a=" fields) that are media specific. These are referred to as "media-level" attributes and add information about the media stream. Attribute fields can also be added before the first media field; these "session-level" attributes convey additional information that applies to the conference as a whole rather than to individual media.

So your a= line says that the media line above carries the H224 payload format over RTP, with the RTP payload type number set to 103. The 4800 is the clock rate field of the rtpmap attribute, whose form is a=rtpmap:<payload type> <encoding name>/<clock rate>.
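The a=rtpmap attribute has the form a=rtpmap:<payload type> <encoding name>/<clock rate>[/<encoding parameters>]. A small Python sketch that pulls those fields apart might look like this; RtpMap and parse_rtpmap are illustrative names, not an existing API.

```python
from typing import NamedTuple, Optional


class RtpMap(NamedTuple):
    payload_type: int
    encoding_name: str
    clock_rate: int
    encoding_params: Optional[str]


def parse_rtpmap(line: str) -> RtpMap:
    """Parse 'a=rtpmap:<pt> <name>/<clock rate>[/<params>]' (illustrative only)."""
    prefix = "a=rtpmap:"
    if not line.startswith(prefix):
        raise ValueError("not an rtpmap attribute: " + line)
    # Payload type and encoding description are separated by a space.
    pt_text, _, encoding = line[len(prefix):].partition(" ")
    name, rate, *params = encoding.strip().split("/")
    return RtpMap(int(pt_text), name, int(rate), params[0] if params else None)


h224 = parse_rtpmap("a=rtpmap:103 H224/4800")
print(h224)
```

For the line in question this yields payload type 103, encoding name H224 and clock rate 4800, with no extra encoding parameters.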

Hope that helps.