I am implementing a decoder using the MediaCodec Java API for decoding a live H.264 remote stream. I receive the H.264 encoded data from the native layer through a callback (`void OnRecvEncodedData(byte[] encodedData)`), decode it, and render it on the `Surface` of a `TextureView`. My implementation is complete (retrieving the encoded stream via the callback, decoding, and rendering). Here is my decoder class:
```java
import java.io.IOException;
import java.nio.ByteBuffer;

import android.media.MediaCodec;
import android.media.MediaFormat;
import android.os.Build;
import android.util.Log;
import android.view.Surface;

public class MediaCodecDecoder extends Thread implements MyFrameAvailableListener {

    private static final boolean VERBOSE = true;
    private static final String LOG_TAG = MediaCodecDecoder.class.getSimpleName();
    private static final String VIDEO_FORMAT = "video/avc"; // H.264
    private static final long mTimeoutUs = 10000L;

    private MediaCodec mMediaCodec;
    private Surface mSurface;
    private volatile boolean m_bConfigured;
    private volatile boolean m_bRunning;
    private long startMs;

    public MediaCodecDecoder() {
        JniWrapper.SetFrameAvailableListener(this);
    }

    // This is my callback where I receive encoded frames from the native layer.
    // Configuration is deferred until the first key frame arrives, since that
    // frame carries the SPS/PPS used as csd-0.
    @Override
    public void OnRecvEncodedData(byte[] encodedData) {
        if (!m_bConfigured && bKeyFrame(encodedData)) {
            Configure(mSurface, 240, 320, encodedData);
        }
        if (m_bConfigured) {
            decodeData(encodedData);
        }
    }

    public void SetSurface(Surface surface) {
        if (mSurface == null) {
            mSurface = surface;
        }
    }

    public void Start() {
        if (m_bRunning)
            return;
        m_bRunning = true;
        start();
    }

    public void Stop() {
        // Only signal the decoder thread here; the codec itself is stopped
        // and released at the end of run(), on the thread that uses it.
        m_bRunning = false;
    }

    private void Configure(Surface surface, int width, int height, byte[] csd0) {
        if (m_bConfigured) {
            Log.e(LOG_TAG, "Decoder is already configured");
            return;
        }
        if (mSurface == null) {
            Log.d(LOG_TAG, "Surface is not available/set yet.");
            return;
        }
        MediaFormat format = MediaFormat.createVideoFormat(VIDEO_FORMAT, width, height);
        format.setByteBuffer("csd-0", ByteBuffer.wrap(csd0));
        try {
            mMediaCodec = MediaCodec.createDecoderByType(VIDEO_FORMAT);
        } catch (IOException e) {
            Log.d(LOG_TAG, "Failed to create codec: " + e.getMessage());
            return; // bail out instead of hitting a null codec below
        }
        startMs = System.currentTimeMillis();
        mMediaCodec.configure(format, surface, null, 0 /* flags 0 = decoder */);
        if (VERBOSE) Log.d(LOG_TAG, "Decoder configured.");
        mMediaCodec.start();
        Log.d(LOG_TAG, "Decoder initialized.");
        m_bConfigured = true;
    }

    @SuppressWarnings("deprecation")
    private void decodeData(byte[] data) {
        if (!m_bConfigured) {
            Log.e(LOG_TAG, "Decoder is not configured yet.");
            return;
        }
        int inIndex = mMediaCodec.dequeueInputBuffer(mTimeoutUs);
        if (inIndex >= 0) {
            ByteBuffer buffer;
            if (Build.VERSION.SDK_INT < Build.VERSION_CODES.LOLLIPOP) {
                buffer = mMediaCodec.getInputBuffers()[inIndex];
                buffer.clear();
            } else {
                // getInputBuffer() returns an already-cleared buffer on API 21+.
                buffer = mMediaCodec.getInputBuffer(inIndex);
            }
            if (buffer != null) {
                buffer.put(data);
                // queueInputBuffer() expects MICROseconds, so convert from ms.
                long presentationTimeUs = (System.currentTimeMillis() - startMs) * 1000;
                mMediaCodec.queueInputBuffer(inIndex, 0, data.length, presentationTimeUs, 0);
            }
        }
    }

    // The NAL unit type is the low 5 bits of the byte following the 4-byte
    // start code; type 7 is an SPS, which prefixes every key frame here.
    private static boolean bKeyFrame(byte[] frameData) {
        return (frameData[4] & 0x1F) == 7;
    }

    @Override
    public void run() {
        try {
            MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
            while (m_bRunning) {
                if (m_bConfigured) {
                    int outIndex = mMediaCodec.dequeueOutputBuffer(info, mTimeoutUs);
                    if (outIndex >= 0) {
                        // true = render the decoded buffer to the Surface.
                        mMediaCodec.releaseOutputBuffer(outIndex, true);
                    }
                } else {
                    // Not configured yet; wait for the first key frame.
                    try {
                        Thread.sleep(10);
                    } catch (InterruptedException ignore) {
                    }
                }
            }
        } finally {
            if (m_bConfigured) {
                mMediaCodec.stop();
                mMediaCodec.release();
            }
        }
    }
}
```
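For completeness, this is roughly how the decoder is wired to the `TextureView` (a simplified sketch; `mDecoder` and the listener wiring here are illustrative, not my exact code):

```java
// Hand the TextureView's SurfaceTexture to the decoder once it is ready,
// and tear the decoder down when the surface goes away.
textureView.setSurfaceTextureListener(new TextureView.SurfaceTextureListener() {
    @Override
    public void onSurfaceTextureAvailable(SurfaceTexture st, int width, int height) {
        mDecoder.SetSurface(new Surface(st));
        mDecoder.Start();
    }

    @Override
    public void onSurfaceTextureSizeChanged(SurfaceTexture st, int width, int height) { }

    @Override
    public boolean onSurfaceTextureDestroyed(SurfaceTexture st) {
        mDecoder.Stop();
        return true; // the SurfaceTexture can be released
    }

    @Override
    public void onSurfaceTextureUpdated(SurfaceTexture st) { }
});
```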
Now the problem is: the stream is decoded and rendered on the surface, but the video is not clear. The frames look broken and the scene is distorted/dirty, with broken movement and square-shaped block artifacts everywhere (I am really sorry, I don't have a screenshot right now).
About my stream: it is H.264 encoded and consists of I-frames and P-frames only (there are no B-frames). Every I-frame has an SPS + PPS + payload structure. The color format used during encoding (with FFmpeg in the native layer) is planar YUV420. The length of the data sent from the native layer is okay (width * height * 3 / 2).
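Since every I-frame packs SPS + PPS + payload into a single buffer, here is a minimal sketch of how such an access unit could be split into individual NAL units, assuming 4-byte Annex B start codes throughout (this helper is hypothetical, not part of my class above):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: splits an Annex B buffer into NAL units by scanning
// for 4-byte start codes (0x00 0x00 0x00 0x01). Each returned unit keeps
// its start code, so unit[4] is the NAL header byte.
private static List<byte[]> splitNalUnits(byte[] data) {
    List<byte[]> units = new ArrayList<>();
    int prev = -1;
    for (int i = 0; i + 3 < data.length; i++) {
        if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 0 && data[i + 3] == 1) {
            if (prev >= 0) {
                units.add(Arrays.copyOfRange(data, prev, i));
            }
            prev = i;
            i += 3; // skip past this start code
        }
    }
    if (prev >= 0) {
        units.add(Arrays.copyOfRange(data, prev, data.length));
    }
    return units;
}
```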
During `configure()` I just set the `csd-0` value with the SPS NAL unit. The frame used for configuration was an I-frame (SPS + PPS + payload), and its prefix was the SPS, so I think the configuration was successful. Note that I didn't set the `csd-1` value with the PPS (is it a problem?).
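If setting both matters, this is roughly how `csd-0` and `csd-1` could be derived from the first key frame using the `splitNalUnits` helper sketched above (again an untested sketch; `format` and `keyFrame` stand in for my actual variables):

```java
// Pull the SPS (NAL type 7) and PPS (NAL type 8) out of the key frame and
// hand them to MediaFormat separately. The NAL type is the low 5 bits of
// the NAL header, which sits right after the 4-byte start code.
for (byte[] nal : splitNalUnits(keyFrame)) {
    int nalType = nal[4] & 0x1F;
    if (nalType == 7) {
        format.setByteBuffer("csd-0", ByteBuffer.wrap(nal)); // SPS, start code included
    } else if (nalType == 8) {
        format.setByteBuffer("csd-1", ByteBuffer.wrap(nal)); // PPS, start code included
    }
}
```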
Every frame has a preceding start code (`0x00 0x00 0x00 0x01`), for P-frames as well as I-frames (for an I-frame the start code is present in front of both the SPS and the PPS).
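To double-check that assumption at runtime, a quick debugging helper like this could log the type of the first NAL unit in each incoming buffer (a sketch, assuming the start code sits at offset 0):

```java
// Debug sketch: log the first NAL unit type of each buffer, or warn if the
// buffer does not begin with the expected 4-byte Annex B start code.
private static void logFirstNalType(byte[] data) {
    if (data.length > 4 && data[0] == 0 && data[1] == 0
            && data[2] == 0 && data[3] == 1) {
        Log.d(LOG_TAG, "First NAL type: " + (data[4] & 0x1F));
    } else {
        Log.w(LOG_TAG, "Buffer does not begin with a 4-byte start code!");
    }
}
```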
Moreover, I am setting the presentation timestamp as `System.currentTimeMillis() - startTime` for every frame, which is increasing for every new frame. I think this shouldn't cause any problem (correct me if I am wrong).
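One caveat I am aware of: `queueInputBuffer()` expects the timestamp in microseconds, not milliseconds, so the value should be converted (as in the class above):

```java
// queueInputBuffer() takes the presentation timestamp in MICROseconds;
// System.currentTimeMillis() returns milliseconds, hence the * 1000.
long presentationTimeUs = (System.currentTimeMillis() - startMs) * 1000;
mMediaCodec.queueInputBuffer(inIndex, 0, data.length, presentationTimeUs, 0);
```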
My device is a Google Nexus 5 running Android 4.4.4, with a Qualcomm MSM8974 Snapdragon 800 chipset. I am decoding to a `Surface`, so I think there should not be any device-specific color format mismatch issues.
I can also provide my `TextureView` code if needed.
What might be the cause of my incorrect decoding/rendering? Thanks in advance!
EDIT 1
I tried manually passing my codec-specific data (the SPS and PPS bytes) during configuration, but this didn't make any change :(
```java
byte[] sps = {0x00, 0x00, 0x00, 0x01, 0x67, 0x4d, 0x40, 0x0c, (byte) 0xda, 0x0f,
        0x0a, 0x68, 0x40, 0x00, 0x00, 0x03, 0x00, 0x40, 0x00, 0x00, 0x07,
        (byte) 0xa3, (byte) 0xc5, 0x0a, (byte) 0xa8};
format.setByteBuffer("csd-0", ByteBuffer.wrap(sps));

byte[] pps = {0x00, 0x00, 0x00, 0x01, 0x68, (byte) 0xef, 0x04, (byte) 0xf2, 0x00, 0x00};
format.setByteBuffer("csd-1", ByteBuffer.wrap(pps));
```
I also tried trimming the start codes (`0x00 0x00 0x00 0x01`), but no progress!
EDIT 2
I tried with a hardware accelerated `TextureView` as mentioned in the official documentation (though I didn't find any H/W acceleration code in the MediaCodec-TextureView sample project). Still no progress, so I have commented out the H/W acceleration snippet for now.
EDIT 3
The screenshots are available now:

[screenshots showing the distorted, blocky rendered frames]
EDIT 4
For further clarification, this is the hex dump of one of my H.264 encoded I-frames:

```
00 00 00 01 67 4d 40 0c da 0f 0a 68 40 00 00 03 00 40 00 00 07 a3 c5 0a a8 00 00 00 01 68 ef 04 f2 00 00 01 06 05 ff ff 69 dc 45 e9 bd e6 d9 48 b7 96 2c d8 20 d9 23 ee ef 78 32 36 34 20 2d 20 63 6f 72 65 20 31 34 36 20 2d 20 48 2e 32 36 34 2f 4d 50 45 47 2d 34 20 41 56 43 20 63 6f 64 65 63 20 2d 20 43 6f 70 79 6c 65 66 74 20 32 30 30 33 2d 32 30 31 35 20 2d 20 68 74 74 70 3a 2f 2f 77 77 77 2e 76 69 64 65 6f 6c 61 6e 2e 6f 72 67 2f 78 32 36 34 2e 68 74 6d 6c 20 2d 20 6f 70 74 69 6f 6e 73 3a 20 63 61 62 61 63 3d 31 20 72 65 66 3d 31 20 64 65 62 6c 6f 63 6b 3d 31 3a 30 3a 30 20 61 6e 61 6c 79 73 65 3d 30 78 31 3a 30 78 31 20 6d 65 3d 68 65 78 20 73 75 62 6d 65 3d 30 20 70 73 79 3d 31 20 70 73 79 5f 72 64 3d 31 2e 30 30 3a 30 2e 30 30 20 6d 69 78 65 64 5f 72 65 66 3d 30 20 6d 65 5f 72 61 6e 67 65 3d 31 36 20 63 68 72 6f 6d 61 5f 6d 65 3d 31 20 74 72 65 6c 6c 69 73 3d 30 20 38 78 38 64 63 74
```
And this is a P-frame:
```
00 00 00 01 41 9a 26 22 df 76 4b b2 ef cf 57 ac 5b b6 3b 68 b9 87 b2 71 a5 9b 61 3c 93 47 bc 79 c5 ab 0f 87 34 f6 40 6a cd 80 03 b1 a2 c2 4e 08 13 cd 4e 3c 62 3e 44 0a e8 97 80 ec 81 3f 31 7c f1 29 f1 43 a0 c0 a9 0a 74 62 c7 62 74 da c3 94 f5 19 23 ff 4b 9c c1 69 55 54 2f 62 f0 5e 64 7f 18 3f 58 73 af 93 6e 92 06 fd 9f a1 1a 80 cf 86 71 24 7d f7 56 2c c1 57 cf ba 05 17 77 18 f1 8b 3c 33 40 18 30 1f b0 19 23 44 ec 91 c4 bd 80 65 4a 46 b3 1e 53 5d 6d a3 f0 b5 50 3a 93 ba 81 71 f3 09 98 41 43 ba 5f a1 0d 41 a3 7b c3 fd eb 15 89 75 66 a9 ee 3a 9c 1b c1 aa f8 58 10 88 0c 79 77 ff 7d 15 28 eb 12 a7 1b 76 36 aa 84 e1 3e 63 cf a9 a3 cf 4a 2d c2 33 18 91 30 f7 3c 9c 56 f5 4c 12 6c 4b 12 1f c5 ec 5a 98 8c 12 75 eb fd 98 a4 fb 7f 80 5d 28 f9 ef 43 a4 0a ca 25 75 19 6b f7 14 7b 76 af e9 8f 7d 79 fa 9d 9a 63 de 1f be fa 6c 65 ba 5f 9d b0 b0 f4 71 cb e2 ea d6 dc c6 55 98 1b cd 55 d9 eb 9c 75 fc 9d ec
```
I am pretty sure about my stream's correctness, as I successfully rendered it using `ffmpeg` software decoding and a `GLSurfaceView` with OpenGL ES 2.0.