From the examples I got the basic idea of this code. However, I am not sure what I am missing, as muxing.c, demuxing.c and decoding_encoding.c all use different approaches.

The process of converting an audio file to another file should go roughly like this: inputfile -demux-> audiostream -read-> inPackets -decode2frames-> frames -encode2packets-> outPackets -write-> audiostream -mux-> outputfile

However, I found the following comment in demuxing.c:

/* Write the raw audio data samples of the first plane. This works
 * fine for packed formats (e.g. AV_SAMPLE_FMT_S16). However,
 * most audio decoders output planar audio, which uses a separate
 * plane of audio samples for each channel (e.g. AV_SAMPLE_FMT_S16P).
 * In other words, this code will write only the first audio channel
 * in these cases.
 * You should use libswresample or libavfilter to convert the frame
 * to packed data. */
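
The distinction the comment draws can be shown without any FFmpeg calls. In a packed (interleaved) format the samples of all channels alternate in a single buffer; in a planar format each channel has its own buffer. Below is a minimal sketch in plain C of the planar-to-packed conversion for 16-bit samples. The function name planar_to_packed_s16 is made up for illustration and is not part of FFmpeg, where libswresample would normally do this job.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, NOT an FFmpeg function.
 * planes[c][i] is sample i of channel c (planar layout).
 * After the call, packed holds the interleaved layout:
 * ch0[0], ch1[0], ..., ch0[1], ch1[1], ...
 * packed must have room for channels * nb_samples values. */
static void planar_to_packed_s16(const int16_t *const *planes, int channels,
                                 int nb_samples, int16_t *packed)
{
    assert(planes != NULL && packed != NULL);
    assert(channels > 0 && nb_samples >= 0);

    for (int i = 0; i < nb_samples; i++) {
        for (int c = 0; c < channels; c++) {
            packed[(size_t)i * (size_t)channels + (size_t)c] = planes[c][i];
        }
    }
}
```

For two stereo planes {1, 2, 3} and {-1, -2, -3}, the packed buffer comes out as 1, -1, 2, -2, 3, -3. Writing only the first plane, as the commented code in demuxing.c does, would keep just 1, 2, 3.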

My questions about this are:

  1. Can I expect a frame retrieved from one of the decoder functions, e.g. avcodec_decode_audio4, to hold values suitable for passing directly to an encoder, or is the resampling step mentioned in the comment mandatory?

  2. Am I taking the right approach? ffmpeg is very asymmetric, i.e. if there is a function open_file_for_input, there might not be a function open_file_for_output. There are also different versions of many functions (avcodec_decode_audio[1-4]) and different naming schemes, so it is very hard to tell whether the general approach is right or actually an ugly mixture of techniques that were used at different version bumps of ffmpeg.

  3. ffmpeg uses a lot of specific terms, like 'planar sampling' or 'packed format', and I am having a hard time finding definitions for these terms. Is it possible to write working code without deep knowledge of audio?
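
To make question 1 concrete: whether a decoded frame can go straight into an encoder comes down to comparing a handful of parameters on both sides. The following is only a sketch in plain C. The names struct audio_params, enum sample_layout and frame_fits_encoder are hypothetical stand-ins for the corresponding fields of AVFrame and AVCodecContext, and treating a frame_size of 0 as "no restriction" is a convention of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT FFmpeg types. */
enum sample_layout { LAYOUT_PACKED, LAYOUT_PLANAR };

struct audio_params {
    enum sample_layout layout; /* packed (interleaved) or planar */
    int bytes_per_sample;      /* e.g. 2 for 16-bit samples */
    int sample_rate;           /* in Hz */
    int channels;
    int frame_size;            /* samples per channel in one frame;
                                  for the encoder, 0 means "any size" here */
};

/* Returns true if a frame described by `decoded` could be handed to an
 * encoder expecting `wanted` without conversion or re-chunking. */
static bool frame_fits_encoder(const struct audio_params *decoded,
                               const struct audio_params *wanted)
{
    assert(decoded != NULL && wanted != NULL);

    if (decoded->layout != wanted->layout)
        return false; /* planar vs. packed mismatch: needs conversion */
    if (decoded->bytes_per_sample != wanted->bytes_per_sample)
        return false; /* different sample width: needs conversion */
    if (decoded->sample_rate != wanted->sample_rate)
        return false; /* different rate: needs resampling */
    if (decoded->channels != wanted->channels)
        return false; /* different channel count: needs remixing */
    if (wanted->frame_size != 0 && decoded->frame_size != wanted->frame_size)
        return false; /* encoder wants fixed-size frames: needs buffering */
    return true;
}
```

If any of these checks fails, some conversion or buffering step between decoder and encoder would be needed; if all pass, the frame could in principle be forwarded unchanged.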

Here is my code so far. Right now it crashes at avcodec_encode_audio2 and I don't know why.

int Java_com_fscz_ffmpeg_Audio_convert(JNIEnv * env, jobject this, jstring jformat, jstring jcodec, jstring jsource, jstring jdest) {
    jboolean isCopy;
    // JNI class names use '/' as the package separator, not '.'
    jclass configClass = (*env)->FindClass(env, "com/fscz/ffmpeg/Config");
    jfieldID fid = (*env)->GetStaticFieldID(env, configClass, "ffmpeg_logging", "I");
    logging = (*env)->GetStaticIntField(env, configClass, fid);

    /// open input
    const char* sourceFile = (*env)->GetStringUTFChars(env, jsource, &isCopy);
    AVFormatContext* pInputCtx;
    AVStream* pInputStream;
    open_input(sourceFile, &pInputCtx, &pInputStream);

    // open output
    const char* destFile = (*env)->GetStringUTFChars(env, jdest, &isCopy);
    const char* cformat = (*env)->GetStringUTFChars(env, jformat, &isCopy);
    const char* ccodec = (*env)->GetStringUTFChars(env, jcodec, &isCopy);
    AVFormatContext* pOutputCtx;
    AVOutputFormat* pOutputFmt;
    AVStream* pOutputStream;
    open_output(cformat, ccodec, destFile, &pOutputCtx, &pOutputFmt, &pOutputStream);

    /// decode/encode
    error = avformat_write_header(pOutputCtx, NULL);
    DIE_IF_LESS_ZERO(error, "error writing output stream header to file: %s, error: %s", destFile, e2s(error));

    AVFrame* frame = avcodec_alloc_frame();
    DIE_IF_UNDEFINED(frame, "Could not allocate audio frame");
    frame->pts = 0;

    LOGI("allocate packet");
    AVPacket pktIn;
    AVPacket pktOut;
    LOGI("done");
    int got_frame, got_packet, len, frame_count = 0;
    int64_t processed_time = 0, duration = pInputStream->duration;
    while (av_read_frame(pInputCtx, &pktIn) >= 0) {
        do {
            len = avcodec_decode_audio4(pInputStream->codec, frame, &got_frame, &pktIn);
            DIE_IF_LESS_ZERO(len, "Error decoding frame: %s", e2s(len));
            if (len < 0) break;
            len = FFMIN(len, pktIn.size);
            size_t unpadded_linesize = frame->nb_samples * av_get_bytes_per_sample(frame->format);
            LOGI("audio_frame n:%d nb_samples:%d pts:%s\n", frame_count++, frame->nb_samples, av_ts2timestr(frame->pts, &(pInputStream->codec->time_base)));
            if (got_frame) {
                do {
                    av_init_packet(&pktOut);
                    pktOut.data = NULL;
                    pktOut.size = 0;
                    LOGI("encode frame");
                    DIE_IF_UNDEFINED(pOutputStream->codec, "no output codec");
                    DIE_IF_UNDEFINED(frame->nb_samples, "no nb samples");
                    DIE_IF_UNDEFINED(pOutputStream->codec->internal, "no internal");
                    LOGI("tests done");
                    len = avcodec_encode_audio2(pOutputStream->codec, &pktOut, frame, &got_packet);
                    LOGI("encode done");
                    DIE_IF_LESS_ZERO(len, "Error (re)encoding frame: %s", e2s(len));
                } while (!got_packet);
                // write packet;
                LOGI("write packet");
                /* Write the compressed frame to the media file. */
                error = av_interleaved_write_frame(pOutputCtx, &pktOut);
                DIE_IF_LESS_ZERO(error, "Error while writing audio frame: %s", e2s(error));
                av_free_packet(&pktOut);
            }
            pktIn.data += len;
            pktIn.size -= len;
        } while (pktIn.size > 0);
        av_free_packet(&pktIn);
    }

    LOGI("write trailer");
    av_write_trailer(pOutputCtx);
    LOGI("end");

    /// close resources
    avcodec_free_frame(&frame);
    avcodec_close(pInputStream->codec);
    av_free(pInputStream->codec);
    avcodec_close(pOutputStream->codec);
    av_free(pOutputStream->codec);
    avformat_close_input(&pInputCtx);
    avformat_free_context(pOutputCtx);

    return 0;
}
Is this the process for compressing videos: inputfile -demux-> audiostream -read-> inPackets -decode2frames-> frames -encode2packets-> outPackets -write-> audiostream -mux-> outputfile? - Mr.G

1 Answer

Meanwhile, I have figured this out and written an Android library project that does this (for audio files): https://github.com/fscz/FFmpeg-Android

See the file /jni/audiodecoder.c for details.