I'm recording sound and encoding it to MP3 with the FFmpeg libraries. I then decode the MP3 data right away and play the decoded data, but the playback is heavily delayed. Here is the code. The first parameter of encode() is the raw PCM data, with len = 44100.
Encoder parameters:
cntx_->channels = 1;
cntx_->sample_rate = 44100;
cntx_->sample_fmt = AV_SAMPLE_FMT_S16P; // enum value 6: signed 16-bit, planar
cntx_->channel_layout = AV_CH_LAYOUT_MONO;
cntx_->bit_rate = 8000;
err_ = avcodec_open2(cntx_, codec_, NULL);
std::vector<unsigned char> encode(unsigned char* encode_data, unsigned int len)
{
    std::vector<unsigned char> ret;
    AVPacket avpkt;
    av_init_packet(&avpkt);
    // av_init_packet() does not touch data/size; clear them so the
    // encoder allocates the output buffer itself.
    avpkt.data = NULL;
    avpkt.size = 0;
    unsigned int len_encoded = 0;   // bytes of PCM consumed so far
    int data_left = len / 2;        // 16-bit samples still to encode
    int miss_c = 0, i = 0;          // calls without / with an output packet
    while (data_left > 0)
    {
        // Feed at most one encoder frame (cntx_->frame_size samples) per call.
        int sz = data_left > cntx_->frame_size ? cntx_->frame_size : data_left;
        mp3_frame_->nb_samples = sz;
        mp3_frame_->format = cntx_->sample_fmt;
        mp3_frame_->channel_layout = cntx_->channel_layout;
        int needed_size = av_samples_get_buffer_size(NULL, 1,
            mp3_frame_->nb_samples, cntx_->sample_fmt, 1);
        int r = avcodec_fill_audio_frame(mp3_frame_, 1, cntx_->sample_fmt,
            encode_data + len_encoded, needed_size, 0);
        if (r < 0)
            break;
        int gotted = -1;
        r = avcodec_encode_audio2(cntx_, &avpkt, mp3_frame_, &gotted);
        if (r < 0)
            break;
        if (gotted){
            i++;
            ret.insert(ret.end(), avpkt.data, avpkt.data + avpkt.size);
        }
        else if (gotted == 0){
            miss_c++;   // samples were accepted but no packet came out
        }
        len_encoded += needed_size;
        data_left -= sz;
        av_free_packet(&avpkt);
    }
    return ret;
}
std::vector<unsigned char> decode(unsigned char* data, unsigned int len)
{
    std::vector<unsigned char> ret;
    AVPacket avpkt;
    av_init_packet(&avpkt);
    avpkt.data = data;
    avpkt.size = len;
    AVFrame* pframe = av_frame_alloc();
    while (avpkt.size > 0){
        int goted = 0;
        av_frame_unref(pframe);
        // Returns the number of bytes consumed, or a negative error code.
        int used = avcodec_decode_audio4(cntx_, pframe, &goted, &avpkt);
        if (used < 0){
            break;
        }
        if (goted){
            // Mono audio: all samples are in data[0]. linesize[0] may include
            // padding, so compute the payload size from nb_samples instead.
            int bytes = pframe->nb_samples *
                av_get_bytes_per_sample((AVSampleFormat)pframe->format);
            ret.insert(ret.end(), pframe->data[0], pframe->data[0] + bytes);
        }
        // Advance past the consumed bytes whether or not a frame came out.
        avpkt.data += used;
        avpkt.size -= used;
        avpkt.dts = avpkt.pts = AV_NOPTS_VALUE;
    }
    av_frame_free(&pframe);
    return ret;
}
Suppose it is the 100th call to encode(data, len): that "frame" does not show up in the decode output until the 150th call or later, and this latency is not acceptable. It seems the mp3lame encoder keeps sample data for later use, which is not what I want. I don't know what is going wrong. Thank you for any information.
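To illustrate the kind of behaviour I suspect, here is a toy model of an encoder that holds back a fixed number of frames before emitting anything. It does not call FFmpeg or LAME and makes no claim about how LAME really buffers; DelayedEncoder and encode_call are hypothetical names, and the "frames" are just sequence numbers.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Toy stand-in for an encoder that holds back `delay` frames before emitting.
// Assumption: a fixed delay; the real encoder may behave differently.
class DelayedEncoder {
public:
    explicit DelayedEncoder(std::size_t delay) : delay_(delay) {}

    // Feed one input frame; returns the frames emitted as a result (0 or 1).
    std::vector<int> push(int frame_id) {
        pending_.push_back(frame_id);
        std::vector<int> out;
        if (pending_.size() > delay_) {
            out.push_back(pending_.front());
            pending_.pop_front();
        }
        return out;
    }

    // Drain whatever is still held back (like flushing at end of stream).
    std::vector<int> flush() {
        std::vector<int> out(pending_.begin(), pending_.end());
        pending_.clear();
        return out;
    }

private:
    std::size_t delay_;
    std::deque<int> pending_;
};

// Feed `count` consecutive frames starting at `first_id`;
// returns how many frames came out during this call.
std::size_t encode_call(DelayedEncoder& enc, int first_id, int count) {
    std::size_t emitted = 0;
    for (int i = 0; i < count; ++i) {
        emitted += enc.push(first_id + i).size();
    }
    return emitted;
}
```

With a delay of 1 and ten frames per call, this model emits 9 frames on the first call and 10 on each later call, and flush() returns the one frame still held back.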
Today I debugged the code again; here are some details:
encode: each PCM buffer passed in is 23040 bytes, which is 10 times the MP3 frame size (2304 bytes of PCM), but each call to encode outputs only 9 frames. Decoding that output yields 20736 bytes, so 1 frame (2304 bytes) is lost and the sound is noisy.
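To double-check these numbers, here is a small standalone sketch of the arithmetic. It assumes 16-bit mono PCM at 44100 Hz and 1152 samples per MP3 frame, which matches the 2304-byte frame size I see. The helper names are made up for illustration and are not part of FFmpeg.

```cpp
#include <cstddef>

// Assumed stream layout: mono, 16-bit samples, 44100 Hz.
constexpr std::size_t kBytesPerSample  = 2;
constexpr std::size_t kSamplesPerFrame = 1152;   // one MPEG-1 Layer III frame
constexpr std::size_t kSampleRate      = 44100;

// Bytes of 16-bit mono PCM covered by one MP3 frame.
constexpr std::size_t frame_bytes() {
    return kSamplesPerFrame * kBytesPerSample;
}

// Number of whole frames contained in a PCM buffer of `pcm_bytes` bytes.
constexpr std::size_t whole_frames(std::size_t pcm_bytes) {
    return pcm_bytes / frame_bytes();
}

// PCM bytes unaccounted for when only `frames_out` frames come back.
// Assumes frames_out * frame_bytes() <= pcm_bytes.
constexpr std::size_t missing_bytes(std::size_t pcm_bytes, std::size_t frames_out) {
    return pcm_bytes - frames_out * frame_bytes();
}

// Playback duration in milliseconds of a PCM buffer of `pcm_bytes` bytes.
constexpr double duration_ms(std::size_t pcm_bytes) {
    return 1000.0 * (pcm_bytes / kBytesPerSample) / kSampleRate;
}
```

For a 23040-byte buffer this gives 10 whole frames; getting 9 back leaves 2304 bytes unaccounted for, and one frame covers about 26 ms of audio.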
If MP3 or MP2 encoding is not suitable for real-time voice transfer, which encoder should I choose?