I'm trying to obtain the PCM samples from a decoded mp4 audio buffer for further processing. I first extract the audio track from a video file recorded with the phone's camera app, and I've confirmed that the audio track is the one being selected by checking for the 'audio/mp4' MIME key:
MediaExtractor extractor = new MediaExtractor();
try {
    extractor.setDataSource(fileUri.getPath());
} catch (IOException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}
int numTracks = extractor.getTrackCount();
for (int i = 0; i < numTracks; ++i) {
    MediaFormat format = extractor.getTrackFormat(i);
    String mime = format.getString(MediaFormat.KEY_MIME);
    //Log.d("mime =",mime);
    if (mime.startsWith("audio/")) {
        extractor.selectTrack(i);
        decoder = MediaCodec.createDecoderByType(mime);
        decoder.configure(format, null, null, 0);
        //getSampleCryptoInfo(MediaCodec.CryptoInfo info)
        break;
    }
}
if (decoder == null) {
    Log.e("DecodeActivity", "Can't find audio info!");
    return;
}
decoder.start();
After that, I iterate through the track, feeding the codec the stream of encoded access units, and pulling the decoded access units into a ByteBuffer (this is code I recycled from a video rendering example posted here https://github.com/vecio/MediaCodecDemo):
ByteBuffer[] inputBuffers = decoder.getInputBuffers();
ByteBuffer[] outputBuffers = decoder.getOutputBuffers();
BufferInfo info = new BufferInfo();
boolean isEOS = false;
while (true) {
    if (!isEOS) {
        int inIndex = decoder.dequeueInputBuffer(10000);
        if (inIndex >= 0) {
            ByteBuffer buffer = inputBuffers[inIndex];
            int sampleSize = extractor.readSampleData(buffer, 0);
            if (sampleSize < 0) {
                // We shouldn't stop the playback at this point, just pass the EOS
                // flag to decoder, we will get it again from the
                // dequeueOutputBuffer
                Log.d("DecodeActivity", "InputBuffer BUFFER_FLAG_END_OF_STREAM");
                decoder.queueInputBuffer(inIndex, 0, 0, 0, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
                isEOS = true;
            } else {
                decoder.queueInputBuffer(inIndex, 0, sampleSize, extractor.getSampleTime(), 0);
                extractor.advance();
            }
        }
    }
    int outIndex = decoder.dequeueOutputBuffer(info, 10000);
    switch (outIndex) {
    case MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED:
        Log.d("DecodeActivity", "INFO_OUTPUT_BUFFERS_CHANGED");
        outputBuffers = decoder.getOutputBuffers();
        break;
    case MediaCodec.INFO_OUTPUT_FORMAT_CHANGED:
        Log.d("DecodeActivity", "New format " + decoder.getOutputFormat());
        break;
    case MediaCodec.INFO_TRY_AGAIN_LATER:
        Log.d("DecodeActivity", "dequeueOutputBuffer timed out!");
        break;
    default:
        ByteBuffer buffer = outputBuffers[outIndex];
        // How to obtain PCM samples from this buffer variable??
        // No Surface was passed to configure(), so there is nothing to render to
        decoder.releaseOutputBuffer(outIndex, false);
        break;
    }
    // All decoded frames have been received, we can stop now
    if ((info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) {
        Log.d("DecodeActivity", "OutputBuffer BUFFER_FLAG_END_OF_STREAM");
        break;
    }
}
The code seems to work with no errors so far, but I'm stuck trying to figure out how to obtain the PCM samples from the ByteBuffer that holds the output buffer. Since I'm working with a 16-bit stereo audio file, I guess I could assume that each sample takes two bytes and that the channels are interleaved. However, I'm not really sure about this, and I'd like a way to retrieve the PCM samples from this byte stream unequivocally. Does anybody know how to get these from the MediaCodec API?
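To make the assumption concrete, here is a minimal sketch of how the valid region of an output buffer could be read as interleaved 16-bit samples. The class and method names (PcmSamples, toShorts, deinterleave) are hypothetical, and the sketch assumes the decoder emits 16-bit signed PCM in native byte order, which is what the MediaCodec documentation describes for raw audio buffers. It uses only java.nio, so it can be tried outside Android.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of the Android framework.
final class PcmSamples {

    // Reads the bytes in [offset, offset + size) as 16-bit samples.
    // Assumption: samples are 16-bit signed integers in native byte order.
    static short[] toShorts(ByteBuffer buffer, int offset, int size) {
        // duplicate() has its own position and limit, so the codec's buffer is untouched
        ByteBuffer view = buffer.duplicate();
        view.position(offset);
        view.limit(offset + size);
        short[] samples = new short[size / 2];
        view.order(ByteOrder.nativeOrder()).asShortBuffer().get(samples);
        return samples;
    }

    // Splits interleaved samples (L R L R ... for stereo) into one array per channel.
    static short[][] deinterleave(short[] interleaved, int channelCount) {
        int frames = interleaved.length / channelCount;
        short[][] channels = new short[channelCount][frames];
        for (int f = 0; f < frames; f++) {
            for (int c = 0; c < channelCount; c++) {
                channels[c][f] = interleaved[f * channelCount + c];
            }
        }
        return channels;
    }
}
```

In the decode loop this would be called as PcmSamples.toShorts(buffer, info.offset, info.size) before releasing the output buffer. Rather than hard-coding stereo, the channel count could be read from decoder.getOutputFormat().getInteger(MediaFormat.KEY_CHANNEL_COUNT).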
I've read about a couple of alternatives using ffmpeg or OpenSL ES, but since I am new to Android programming I was hoping to avoid the complications of C-based APIs and build my first app using only the tools provided by the Android Framework (I'm using KitKat). Any help will be greatly appreciated.
UPDATE: I was able to extract the PCM samples the way I had assumed, which is also the way @marcone pointed out. To do so, I added these lines below the buffer assignment:
byte[] b = new byte[info.size];
int a = buffer.position();
// The valid data starts at info.offset and spans info.size bytes
buffer.position(info.offset);
buffer.get(b);
buffer.position(a);
and finally wrote the byte array to a file with:
f.write(b, 0, b.length);
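Note that bytes written this way form a headerless raw PCM dump. If the goal is a wave file that an audio tool can open directly, a 44-byte RIFF/WAVE header has to precede the samples. The following is only a sketch of that canonical PCM header; WavHeader and build are hypothetical names, and it assumes the samples are little-endian and that the total number of data bytes is known when the header is written.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that builds the canonical 44-byte header for uncompressed PCM.
final class WavHeader {

    static byte[] build(int sampleRate, int channels, int bitsPerSample, int dataBytes) {
        int blockAlign = channels * bitsPerSample / 8;   // bytes per frame
        int byteRate = sampleRate * blockAlign;          // bytes per second
        ByteBuffer h = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        h.put(ascii("RIFF")).putInt(36 + dataBytes).put(ascii("WAVE"));
        h.put(ascii("fmt ")).putInt(16);                 // size of the fmt chunk
        h.putShort((short) 1);                           // format tag 1 = linear PCM
        h.putShort((short) channels);
        h.putInt(sampleRate);
        h.putInt(byteRate);
        h.putShort((short) blockAlign);
        h.putShort((short) bitsPerSample);
        h.put(ascii("data")).putInt(dataBytes);
        return h.array();
    }

    private static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }
}
```

The sample rate and channel count passed in would come from the decoder's output format rather than being guessed.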
The problem I'm dealing with now is:
The decoded audio samples do not exactly match the decoding of the same mp4 audio track done by iZotope. There is a 48-sample mismatch in the sizes of the wave files and a 2112-sample delay between the decoded signals. My question now is: do all mp4 decoders yield the same output PCM stream, or does it depend on the implementation of the decoder?
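For anyone who wants to reproduce the comparison, this is a rough sketch of how such a delay can be measured. It does not answer the question; it only searches for the lag at which one signal best lines up with the other. LagFinder and findLag are hypothetical names, and the sketch assumes both signals are single-channel 16-bit sample arrays at the same sample rate. As a side note, 2112 samples is also the figure commonly cited for AAC encoder priming delay, but whether that explains this case is only a guess.

```java
// Hypothetical helper for measuring the offset between two decodes of the same audio.
final class LagFinder {

    // Returns the lag in [0, maxLag] that minimises the sum of absolute differences
    // between delayed[i + lag] and reference[i] over `window` samples.
    static int findLag(short[] delayed, short[] reference, int maxLag, int window) {
        int bestLag = 0;
        long bestError = Long.MAX_VALUE;
        for (int lag = 0; lag <= maxLag; lag++) {
            // Stop once a full window no longer fits, so every lag compares equally many samples
            if (delayed.length - lag < window || reference.length < window) {
                break;
            }
            long error = 0;
            for (int i = 0; i < window; i++) {
                error += Math.abs(delayed[i + lag] - reference[i]);
            }
            if (error < bestError) {
                bestError = error;
                bestLag = lag;
            }
        }
        return bestLag;
    }
}
```

A brute-force search like this is quadratic in the worst case, which is acceptable for lags of a few thousand samples and a window of similar size. Lossy decoders need not be bit-identical, so the minimum error may be small rather than exactly zero.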