I am trying to encode raw PCM audio to G711A and G711U and then decode it. With these codecs everything works fine, because I can choose any value of AVCodecContext frame_size for encoding. With the Opus codec, however, AVCodecContext frame_size is 120, so if I understood correctly, when my input array is bigger than 120 I need to do some kind of buffering: split the input data into several parts, put each part into AVFrame->data and pass the AVFrame to the encoder sequentially.
As a result I get very bad sound, and not only with the Opus codec: the same happens with G711 if I set its AVCodecContext frame_size to a value smaller than the size of my input data.
So my question is: what is the correct way to encode input data whose size is bigger than AVCodecContext frame_size? Do I need to split the input data into parts that are <= AVCodecContext frame_size, and if so, how should I do that?
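For reference, a frame sized to the encoder's frame_size can be allocated like this (a minimal sketch, not my exact setup code; frameEncode and contextEncoder are the names used in the code below, the interleaved sample format is an assumption):

// Sketch: allocate frameEncode so that it holds exactly one encoder frame.
// Assumes contextEncoder is already configured and opened.
frameEncode = av_frame_alloc();
frameEncode->format = contextEncoder->sample_fmt;
frameEncode->nb_samples = contextEncoder->frame_size;
av_channel_layout_copy(&frameEncode->ch_layout, &contextEncoder->ch_layout); // frameEncode->channel_layout on older FFmpeg
if (av_frame_get_buffer(frameEncode, 0) < 0)
    LOGE(TAG, "[encode] could not allocate frame buffer");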
At this moment my code looks like this:
void encode(uint8_t *data, unsigned int length)
{
    int rawOffset = 0;
    int rawDelta = 0;
    // take at most one frame worth of data per iteration
    int rawSamplesCount = frameEncode->nb_samples <= length ? frameEncode->nb_samples : length;

    while (rawSamplesCount > 0)
    {
        // copy the next chunk into the frame and encode it
        memcpy(frameEncode->data[0], &data[rawOffset], sizeof(uint8_t) * rawSamplesCount);
        encodeFrame();

        rawOffset += rawSamplesCount;
        rawDelta = length - rawOffset;
        rawSamplesCount = rawDelta > frameEncode->nb_samples ? frameEncode->nb_samples : rawDelta;
    }

    av_frame_unref(frameEncode);
}
void encodeFrame()
{
    /* send the frame for encoding */
    int ret = avcodec_send_frame(contextEncoder, frameEncode);
    if (ret < 0)
    {
        LOGE(TAG, "[encodeFrame] avcodec_send_frame error: %s", av_err2str(ret));
        return;
    }

    /* read all the available output packets (in general there may be any number of them) */
    while (ret >= 0)
    {
        ret = avcodec_receive_packet(contextEncoder, packetEncode);
        if (ret < 0 && ret != AVERROR(EAGAIN)) LOGE(TAG, "[encodeFrame] error in avcodec_receive_packet: %s", av_err2str(ret));
        if (ret < 0) break;

        std::pair<uint8_t*, unsigned int> p = std::pair<uint8_t*, unsigned int>();
        p.first = (uint8_t *)(malloc(sizeof(uint8_t) * packetEncode->size));
        memcpy(p.first, packetEncode->data, (size_t)packetEncode->size);
        p.second = (unsigned int)(packetEncode->size);
        listEncode.push_back(p); // place encoded data into list to finally create one array of encoded data from it
    }

    av_packet_unref(packetEncode);
}
You can see that I split my input data into several parts, put each part into frame->data and pass the frame to the encoder, but I'm not sure this is the correct way.
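I suspect the chunk size may need to be expressed in bytes rather than samples, i.e. one chunk should be frame_size * channels * bytes_per_sample bytes. Here is a minimal sketch of what I mean, assuming interleaved 16-bit PCM (AV_SAMPLE_FMT_S16); it reuses the contextEncoder / frameEncode / encodeFrame names from above, everything else is an assumption:

// Sketch: split an interleaved S16 input buffer into encoder-sized chunks.
// Assumes frameEncode was allocated with nb_samples == contextEncoder->frame_size.
void encodeChunked(const uint8_t *data, unsigned int lengthInBytes)
{
    const int bytesPerSample = av_get_bytes_per_sample(contextEncoder->sample_fmt); // 2 for S16
    const int channels = contextEncoder->ch_layout.nb_channels;                     // contextEncoder->channels on older FFmpeg
    const unsigned int frameBytes = frameEncode->nb_samples * channels * bytesPerSample;

    unsigned int offset = 0;
    while (lengthInBytes - offset >= frameBytes)
    {
        av_frame_make_writable(frameEncode); // the encoder may still hold a reference to the old data
        memcpy(frameEncode->data[0], data + offset, frameBytes);
        encodeFrame();
        offset += frameBytes;
    }
    // A remainder smaller than one full frame would have to be buffered
    // until enough samples arrive to fill the next frame.
}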
UPD: I noticed that with G711, if I set AVCodecContext frame_size to 160 and the size of my input data is 160 or 320, everything works fine, but if the input data size is 640 then I get a bad buzzing sound.
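For completeness, this is how the byte size of one full encoder frame could be checked and compared against the amount I memcpy per iteration (a sketch; it assumes interleaved samples and reuses contextEncoder from above):

// Sketch: byte size of one full encoder frame, with no alignment padding.
int channels = contextEncoder->ch_layout.nb_channels; // contextEncoder->channels on older FFmpeg
int frameBytes = av_samples_get_buffer_size(NULL,
                                            channels,
                                            contextEncoder->frame_size,
                                            contextEncoder->sample_fmt,
                                            1 /* align = 1, no padding */);
// e.g. for 16-bit mono PCM with frame_size 160 this gives 320 bytes per frame.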