
I am able to see the video playing in my TextureView but it is fairly corrupted. I have verified that I am receiving complete packets in the correct order. I have been able to parse the RTP header correctly. I believe my issue is related to the SPS and PPS and the MediaCodec.

My understanding is that you are supposed to strip the RTP header from the message and prepend the Annex B start code 0x00000001 (an H.264 NAL unit start code, not part of RTP) to each NAL unit, so that the input buffer to the decoder has the form 0x00000001[sps] 0x00000001[pps] 0x00000001[video data].
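For reference, here is a rough sketch of that framing step. It assumes a fixed 12-byte RTP header (no CSRC entries, no header extension) and one complete NAL unit per packet; fragmented NAL units (FU-A) would need to be reassembled first:

private static final byte[] START_CODE = {0x00, 0x00, 0x00, 0x01};
private static final int RTP_HEADER_LENGTH = 12; // fixed header only, no CSRCs/extension

// Strip the RTP header and prepend the Annex B start code.
static byte[] toAnnexB(byte[] rtpPacket) {
    int payloadLength = rtpPacket.length - RTP_HEADER_LENGTH;
    byte[] annexB = new byte[START_CODE.length + payloadLength];
    System.arraycopy(START_CODE, 0, annexB, 0, START_CODE.length);
    System.arraycopy(rtpPacket, RTP_HEADER_LENGTH, annexB, START_CODE.length, payloadLength);
    return annexB;
}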

My confusion is that MediaCodec appears to require a MediaFormat with the SPS and PPS defined separately. I found the following example, which I am currently using along with the message format described above:

MediaFormat format = MediaFormat.createVideoFormat(MediaFormat.MIMETYPE_VIDEO_AVC, width, height);

// from avconv, when streaming sample.h264.mp4 from disk
byte[] header_sps = {0, 0, 0, 1, 0x67, 0x64, (byte) 0x00, 0x1e, (byte) 0xac, (byte) 0xd9, 0x40, (byte) 0xa0, 0x3d,
            (byte) 0xa1, 0x00, 0x00, (byte) 0x03, 0x00, 0x01, 0x00, 0x00, 0x03, 0x00, 0x3C, 0x0F, 0x16, 0x2D, (byte) 0x96}; // sps
byte[] header_pps = {0, 0, 0, 1, 0x68, (byte) 0xeb, (byte) 0xec, (byte) 0xb2, 0x2C}; // pps

// the codec-specific data keys are the string literals "csd-0" and "csd-1"
format.setByteBuffer("csd-0", ByteBuffer.wrap(header_sps));
format.setByteBuffer("csd-1", ByteBuffer.wrap(header_pps));
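For completeness, a minimal sketch of how that format would then be handed to the decoder, assuming surface is built from the TextureView:

// Assumes: Surface surface = new Surface(textureView.getSurfaceTexture());
// createDecoderByType throws IOException, so wrap it in a try/catch.
MediaCodec codec = MediaCodec.createDecoderByType(MediaFormat.MIMETYPE_VIDEO_AVC);
codec.configure(format, surface, null, 0); // no crypto, flags 0 = decoder
codec.start();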

As you can see, I am not providing the MediaFormat with the SPS and PPS from my own video stream, but instead using a hard-coded set from an internet example. I've tried to find sources explaining how to extract the SPS and PPS from a packet, but haven't been able to find anything.

Questions:

Am I supposed to strip the SPS and PPS from my buffer before passing it to the MediaCodec if the MediaFormat is already being provided the SPS and PPS?

How do you correctly parse the SPS and PPS from a message?

Here are the first few bytes of one of my RTP packets with the header included:

80 a1 4c c3 32 2c 24 7a f5 5c 9f bb 47 40 44 3a 40 0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 0 0 1 c0 0 71 80 80 5 21 0 5d d6 d9 ff fb 12 c4 e7 0 5 5c 41 71 2c 30 c1 30 b1 88 6c f5 84 98 2c 82 f5 84 82 44 96 72 45 ca 96 30 35 91 83 86 42 e4 90 28 b1 81 1a 6 57 a8 37 b0 60 56 81 72 71 5c 58 a7 4e af 67 bd 10 13 1 af e9 71 15 13 da a0 15 d5 72 38 36 2e 35 11 31 10 a4 12 1e 26 28 40 b5 3b 65 8c 30 54 8a 96 1b c5 a7 b5 84 cb a9 aa 3d d4 53 47 0 45 34 55 0 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff bf 9 95 2b 73 93 4e c3 f9 b1 d0 5f f5 de c9 9e f7 f8 23 ab a5 aa


1 Answer


Yes, you are correct that MediaCodec requires the SPS and PPS to be provided before decoding. You must extract the SPS/PPS from the SDP response, which is the reply to the DESCRIBE command sent to the server (camera) during the RTSP handshake. Within the SDP response there is a sprop-parameter-sets attribute which contains the SPS/PPS. You can see it in Wireshark as:

Media format specific parameters: sprop-parameter-sets=Z2QAKKwbGoB4AiflwFuAgICgAAB9AAAOph0MAHz4AAjJdd5caGAD58AARkuu8uFAAA==,aO44MAA=

They are separated by a comma and must be Base64-decoded, as in the sketch below. See this for an explanation: How to decode sprop-parameter-sets in a H264 SDP?
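A minimal sketch of that step, assuming spropValue is a hypothetical string holding the text after sprop-parameter-sets= and format is the MediaFormat being configured:

// spropValue is assumed to look like "Z2QAKKwb...==,aO44MAA=" (hypothetical variable)
String[] paramSets = spropValue.split(",");
byte[] startCode = {0x00, 0x00, 0x00, 0x01};

byte[] sps = Base64.decode(paramSets[0], Base64.DEFAULT); // android.util.Base64
byte[] pps = Base64.decode(paramSets[1], Base64.DEFAULT);

ByteBuffer csd0 = ByteBuffer.allocate(startCode.length + sps.length);
csd0.put(startCode).put(sps);
csd0.flip();

ByteBuffer csd1 = ByteBuffer.allocate(startCode.length + pps.length);
csd1.put(startCode).put(pps);
csd1.flip();

format.setByteBuffer("csd-0", csd0);
format.setByteBuffer("csd-1", csd1);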