2 votes

I am trying to use the MediaCodec API to decode a live screen-capture stream sent from a PC by ffmpeg.

For Sender (PC ffmpeg)

I use this command:

ffmpeg -re -f gdigrab -s 1920x1080 -threads 4 -i desktop -vcodec libx264 -pix_fmt yuv420p -tune zerolatency -profile:v baseline -flags global_header -s 1280x720 -an -f rtp rtp://192.168.1.6:1234

and the output looks like this:

Output #0, rtp, to 'rtp://192.168.1.6:1234':
  Metadata:
    encoder         : Lavf56.15.104
    Stream #0:0: Video: h264 (libx264), yuv420p, 1280x720, q=-1--1, 29.97 fps, 90k tbn, 29.97 tbc
Metadata:
  encoder         : Lavc56.14.100 libx264
Stream mapping:
  Stream #0:0 -> #0:0 (bmp (native) -> h264 (libx264))
SDP:
v=0
o=- 0 0 IN IP4 127.0.0.1
s=No Name
c=IN IP4 192.168.1.6
t=0 0
a=tool:libavformat 56.15.104
m=video 1234 RTP/AVP 96
a=rtpmap:96 H264/90000
a=fmtp:96 packetization-mode=1; sprop-parameter-sets=Z0LAH9kAUAW6EAAAPpAADqYI8YMkgA==,aMuDyyA=; profile-level-id=42C01F

Press [q] to stop, [?] for help
frame=   19 fps=0.0 q=17.0 size=     141kB time=00:00:00.63 bitrate=1826.0kbits/
frame=   34 fps= 32 q=17.0 size=     164kB time=00:00:01.13 bitrate=1181.5kbits/
frame=   50 fps= 32 q=18.0 size=     173kB time=00:00:01.66 bitrate= 850.9kbits/

For Receiver (Android MediaCodec)

I created an activity with a SurfaceView and implemented SurfaceHolder.Callback

In surfaceChanged

@Override
public void surfaceChanged(SurfaceHolder holder, int format, int width, int height) {
    Log.i("sss", "surfaceChanged");
    if( playerThread == null ) {
        playerThread = new PlayerThread(holder.getSurface());
        playerThread.start();
    }

}

For PlayerThread

class PlayerThread extends Thread {

    MediaCodec decoder;
    Surface surface;

    public PlayerThread(Surface surface) {
        this.surface = surface;
    }

    @Override
    public void run() {
        running = true;
        try {
            MediaFormat format = MediaFormat.createVideoFormat("video/avc", 1280, 720);
            byte[] header = new byte[] {0,0,0,1};
            byte[] sps = Base64.decode("Z0LAH9kAUAW6EAAAPpAADqYI8YMkgA==", Base64.DEFAULT);
            byte[] pps = Base64.decode("aMuDyyA=", Base64.DEFAULT);

            byte[] header_sps = new byte[sps.length + header.length];
            System.arraycopy(header,0,header_sps,0,header.length);
            System.arraycopy(sps,0,header_sps,header.length, sps.length);

            byte[] header_pps = new byte[pps.length + header.length];
            System.arraycopy(header,0, header_pps, 0, header.length);
            System.arraycopy(pps, 0, header_pps, header.length, pps.length);

            format.setByteBuffer("csd-0", ByteBuffer.wrap(header_sps));
            format.setByteBuffer("csd-1", ByteBuffer.wrap(header_pps));
            format.setInteger(MediaFormat.KEY_MAX_INPUT_SIZE, 1280 * 720);
//          format.setInteger("durationUs", 63446722);
//          format.setByteBuffer("csd-2", ByteBuffer.wrap((hexStringToByteArray("42C01E"))));                      
//          format.setInteger(MediaFormat.KEY_COLOR_FORMAT ,MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420Planar);
            Log.i("sss", "Format = " + format);

            try {
                decoder = MediaCodec.createDecoderByType("video/avc");
                decoder.configure(format, surface, null, 0);
                decoder.start();

            } catch (IOException ioEx) {
                ioEx.printStackTrace();
            }

            DatagramSocket socket = new DatagramSocket(1234);
            byte[] bytes = new byte[4096];
            DatagramPacket packet = new DatagramPacket(bytes, bytes.length);

            byte[] data;

            ByteBuffer[] inputBuffers;
            ByteBuffer[] outputBuffers;

            ByteBuffer inputBuffer;
            ByteBuffer outputBuffer;

            MediaCodec.BufferInfo bufferInfo;

            bufferInfo = new MediaCodec.BufferInfo();
            int inputBufferIndex;
            int outputBufferIndex;
            byte[] outData;

            inputBuffers = decoder.getInputBuffers();
            outputBuffers = decoder.getOutputBuffers();

            int minusCount = 0;
            byte[] prevData = new byte[65535];
            List<byte[]> payloads = new ArrayList<>();
            int payloadSize = 0;
            while (true) {
                try {
                    socket.receive(packet);
                    data = new byte[packet.getLength()];
                    System.arraycopy(packet.getData(), packet.getOffset(), data, 0, packet.getLength());

                    inputBufferIndex = decoder.dequeueInputBuffer(-1);
                    Log.i("sss", "inputBufferIndex = " + inputBufferIndex);
                    if (inputBufferIndex >= 0) {
                        inputBuffer = inputBuffers[inputBufferIndex];
                        inputBuffer.clear();
                        inputBuffer.put(data);
                        decoder.queueInputBuffer(inputBufferIndex, 0, data.length, 0, 0);
//                      decoder.flush();
                    }

                    outputBufferIndex = decoder.dequeueOutputBuffer(bufferInfo, 10000);
                    Log.i("sss", "outputBufferIndex = " + outputBufferIndex);

                    while (outputBufferIndex >= 0) {
                        outputBuffer = outputBuffers[outputBufferIndex];
                        outputBuffer.position(bufferInfo.offset);
                        outputBuffer.limit(bufferInfo.offset + bufferInfo.size);

                        outData = new byte[bufferInfo.size];
                        outputBuffer.get(outData);

                        decoder.releaseOutputBuffer(outputBufferIndex, false);
                        outputBufferIndex = decoder.dequeueOutputBuffer(bufferInfo, 0);
                    }
                } catch (SocketTimeoutException e) {
                    Log.d("thread", "timeout");
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

I don't think the stream from ffmpeg is the problem, because I can open it in MX Player via the SDP file. And if I feed the stream into a local RTSP server (VLC) and then use MediaPlayer to play the RTSP stream, it works, but quite slowly.

After looking into the packets, I realized that (see the parsing sketch below):

  • the first four bytes are the header and sequence number
  • the next four bytes are the timestamp
  • the next four bytes are the source identifier (SSRC)
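For reference, reading those fields looks roughly like this (a minimal sketch, assuming a fixed 12-byte RTP header with no CSRC entries or header extension):

// RTP fixed header: V/P/X/CC, M/PT, then sequence number, timestamp, SSRC
int version        = (data[0] >> 6) & 0x03;                   // should be 2
int payloadType    = data[1] & 0x7F;                          // 96 per the SDP
int sequenceNumber = ((data[2] & 0xFF) << 8) | (data[3] & 0xFF);
long timestamp     = ((long) (data[4] & 0xFF) << 24) | ((data[5] & 0xFF) << 16)
                   | ((data[6] & 0xFF) << 8) | (data[7] & 0xFF);
long ssrc          = ((long) (data[8] & 0xFF) << 24) | ((data[9] & 0xFF) << 16)
                   | ((data[10] & 0xFF) << 8) | (data[11] & 0xFF);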

So I cut the first 12 bytes out and combine packets that share the same timestamp, then put the result into the input buffer like this.

In while(true), after receiving a packet:

                Log.i("sss", "Received = " + data.length + " bytes");
                Log.i("sss","prev " + prevData.length + " bytes = " + getBytesStr(prevData));
                Log.i("sss","data " + data.length + " bytes = " + getBytesStr(data));

                        if(data[4] == prevData[4] && data[5] == prevData[5] && data[6] == prevData[6] && data[7] == prevData[7]){
                            byte[] playload = new byte[prevData.length -12];
                            System.arraycopy(prevData,12,playload, 0, prevData.length-12);
                            playLoads.add(playload);
                            playloadSize += playload.length;
                            Log.i("sss", "Same timeStamp playload " + playload.length + " bytes = " + getBytesStr(playload));
                        } else {
                            if(playLoads.size() > 0){
                                byte[] playload = new byte[prevData.length -12];
                                System.arraycopy(prevData,12,playload, 0, prevData.length-12);
                                playLoads.add(playload);
                                playloadSize += playload.length;
                                Log.i("sss", "last playload " + playload.length + " bytes = " + getBytesStr(playload));

                                inputBufferIndex = decoder.dequeueInputBuffer(-1);
                                if (inputBufferIndex >= 0){
                                    inputBuffer = inputBuffers[inputBufferIndex];
                                    inputBuffer.clear();
                                    byte[] allPlayload = new byte[playloadSize];
                                    int curLength = 0;
                                    for(byte[] playLoad:playLoads){
                                        System.arraycopy(playLoad,0,allPlayload, curLength, playLoad.length);
                                        curLength += playLoad.length;
                                    }
                                    Log.i("sss", "diff timeStamp AlllayLoad " + allPlayload.length + "bytes = " + getBytesStr(allPlayload));
                                    inputBuffer.put(allPlayload);

                                    decoder.queueInputBuffer(inputBufferIndex, 0, data.length, 0, 0);
                                    decoder.flush();
                                }

                                bufferInfo = new MediaCodec.BufferInfo();
                                outputBufferIndex = decoder.dequeueOutputBuffer(bufferInfo, 10000);
                                if(outputBufferIndex!= -1)
                                    Log.i("sss", "outputBufferIndex = " + outputBufferIndex);

                                playLoads = new ArrayList<>();
                                prevData = new byte[65535];
                                playloadSize = 0;
                            }

                        }

                    prevData = data.clone();

The outputBufferIndex still returns -1.

If I change timeoutUs from 10000 to -1, it never proceeds to the next line.

I've searched for a week but still no luck T_T

Why does dequeueOutputBuffer always return -1?

What is the problem with my code?

Could you suggest how to fix my code so it works correctly?

Thanks for your help.

Edit#1

Thanks to @mstorsjo for pointing me to packetization; I found useful information here:

How to process raw UDP packets so that they can be decoded by a decoder filter in a directshow source filter

Then I edited my code as below:

if ((data[12] & 0x1f) == 28) {            // FU-A fragmentation unit
   if ((data[13] & 0x80) == 0x80) {       // found start bit
      inputBufferIndex = decoder.dequeueInputBuffer(-1);
      if (inputBufferIndex >= 0) {
         inputBuffer = inputBuffers[inputBufferIndex];
         inputBuffer.clear();
         // rebuild the original NAL header from the FU indicator + FU header
         byte result = (byte) ((data[12] & 0xe0) | (data[13] & 0x1f));
         inputBuffer.put(new byte[] {0, 0, 1});
         inputBuffer.put(result);
         inputBuffer.put(data, 14, data.length - 14);
      }

   } else if ((data[13] & 0x40) == 0x40) { // found stop bit
      inputBuffer.put(data, 14, data.length - 14);
      // queue the accumulated NAL unit, not just the last packet's length
      decoder.queueInputBuffer(inputBufferIndex, 0, inputBuffer.position(), 0, 0);
      bufferInfo = new MediaCodec.BufferInfo();
      outputBufferIndex = decoder.dequeueOutputBuffer(bufferInfo, 10000);

      switch (outputBufferIndex)
      {
         case MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED:
            outputBuffers = decoder.getOutputBuffers();
            Log.w("sss", "Output Buffers Changed");
            break;
         case MediaCodec.INFO_OUTPUT_FORMAT_CHANGED:
            Log.w("sss", "Output Format Changed");
            MediaFormat newFormat = decoder.getOutputFormat();
            Log.i("sss", "New format : " + newFormat);
            break;
         case MediaCodec.INFO_TRY_AGAIN_LATER:
            Log.w("sss", "Try Again Later");
            break;
         default:
            outputBuffer = outputBuffers[outputBufferIndex];
            outputBuffer.position(bufferInfo.offset);
            outputBuffer.limit(bufferInfo.offset + bufferInfo.size);
            decoder.releaseOutputBuffer(outputBufferIndex, true);
      }
   } else {                                // middle fragment
      inputBuffer.put(data, 14, data.length - 14);
   }
}

Now I can see some picture, but most of the screen is gray.

What should I do next?

Thank you.


2 Answers

1 vote

You can't just discard the RTP header and pretend that the rest of the packet is a normal H264 frame - it isn't. See RFC 6184 for an explanation of the format used when H264 is packetized into RTP. You need to undo this packetization to bring the data back into a format that a normal decoder can handle. You can have a look at libavformat/rtpdec_h264.c in libav/ffmpeg for an example on how to do this.
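For illustration only, here is a minimal sketch of that idea (not the ffmpeg code itself), handling single NAL unit packets (types 1-23) and FU-A fragments (type 28). It assumes payload is the RTP payload with the 12-byte RTP header already stripped, and out is a java.io.ByteArrayOutputStream accumulating an Annex B byte stream:

// Returns true when a complete NAL unit has just been finished in `out`.
boolean depacketize(byte[] payload, ByteArrayOutputStream out) {
    int nalType = payload[0] & 0x1F;
    if (nalType == 28) {                              // FU-A fragment
        if ((payload[1] & 0x80) != 0) {               // start bit
            // rebuild the NAL header from the FU indicator + FU header
            byte nalHeader = (byte) ((payload[0] & 0xE0) | (payload[1] & 0x1F));
            out.write(0); out.write(0); out.write(0); out.write(1); // start code
            out.write(nalHeader);
        }
        out.write(payload, 2, payload.length - 2);    // fragment body
        return (payload[1] & 0x40) != 0;              // end bit
    } else if (nalType >= 1 && nalType <= 23) {       // single NAL unit packet
        out.write(0); out.write(0); out.write(0); out.write(1);
        out.write(payload, 0, payload.length);
        return true;
    }
    return false;                                     // STAP-A etc. not handled here
}

STAP-A aggregation packets (type 24) would additionally need their contained NAL units split out, which is part of what rtpdec_h264.c does.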

0 votes

This might be late, but I can see two possible problems.

1) You're only looking at NAL units with NAL type 28 (FU-A), but ffmpeg is sending NAL units with types 1, 24 and 28. The type 24 NAL units can be ignored without risk, but the type 1 NAL units cannot be ignored (they have NRI > 0).

2) RTP packets will not necessarily arrive in the order they were sent, so it is possible for a frame to be reconstructed in the wrong order. To ensure the right order you would have to look at the sequence numbers in the RTP headers (a small reordering sketch follows).
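A rough sketch of such a reordering buffer, assuming a hypothetical deliverInOrder() consumer that does the depacketization:

// Minimal reordering buffer keyed on the 16-bit RTP sequence number
// (bytes 2-3 of the header). Packets are released strictly in order.
TreeMap<Integer, byte[]> pending = new TreeMap<>();
int nextSeq = -1;

void onPacket(byte[] packet) {
    int seq = ((packet[2] & 0xFF) << 8) | (packet[3] & 0xFF);
    if (nextSeq == -1) nextSeq = seq;            // first packet seen
    pending.put(seq, packet);
    while (pending.containsKey(nextSeq)) {
        deliverInOrder(pending.remove(nextSeq)); // hypothetical consumer
        nextSeq = (nextSeq + 1) & 0xFFFF;        // 16-bit wrap-around
    }
}

A real jitter buffer would also need to time out and skip packets that never arrive, since this sketch stalls forever on a lost packet.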

A good library I found that does this is Android Streaming Client. You would need to modify it slightly to use the correct csd-0/csd-1 in the MediaFormat, and to have it output to an arbitrary surface instead of one from a SurfaceView.