I use Oboe to play sounds in my NDK library, and OpenSL ES with the Android extensions to decode WAV files into PCM. The decoded signed 16-bit PCM samples are stored in memory (std::forward_list<int16_t>) and then sent to the Oboe stream via a callback. The sound I hear from my phone matches the original WAV file in volume, but not in quality: it bursts and crackles.

I am guessing that I send PCM to the audio stream in the wrong order or format (sampling rate?). How can I use OpenSL decoding with an Oboe audio stream?


To decode files to PCM, I use AndroidSimpleBufferQueue as a sink, and AndroidFD with AAssetManager as a source:

// Loading asset
AAsset* asset = AAssetManager_open(manager, path, AASSET_MODE_UNKNOWN);
off_t start, length;
int fd = AAsset_openFileDescriptor(asset, &start, &length);
AAsset_close(asset);

// Creating audio source
SLDataLocator_AndroidFD loc_fd = { SL_DATALOCATOR_ANDROIDFD, fd, start, length };
SLDataFormat_MIME format_mime = { SL_DATAFORMAT_MIME, NULL, SL_CONTAINERTYPE_UNSPECIFIED };
SLDataSource audio_source = { &loc_fd, &format_mime };

// Creating audio sink
SLDataLocator_AndroidSimpleBufferQueue loc_bq = { SL_DATALOCATOR_ANDROIDSIMPLEBUFFERQUEUE, 1 };
SLDataFormat_PCM pcm = {
    .formatType = SL_DATAFORMAT_PCM,
    .numChannels = 2,
    .samplesPerSec = SL_SAMPLINGRATE_44_1,
    .bitsPerSample = SL_PCMSAMPLEFORMAT_FIXED_16,
    .containerSize = SL_PCMSAMPLEFORMAT_FIXED_16,
    .channelMask = SL_SPEAKER_FRONT_LEFT | SL_SPEAKER_FRONT_RIGHT,
    .endianness = SL_BYTEORDER_LITTLEENDIAN
};
SLDataSink sink = { &loc_bq, &pcm };

Then I register a callback, enqueue buffers, and move PCM from the buffer into storage until decoding is done.

NOTE: the WAV file is also 2-channel signed 16-bit 44.1 kHz PCM

My oboe stream configuration is the same:

AudioStreamBuilder builder;
builder.setChannelCount(2);
builder.setSampleRate(44100);
builder.setCallback(this);
builder.setFormat(AudioFormat::I16);
builder.setPerformanceMode(PerformanceMode::LowLatency);
builder.setSharingMode(SharingMode::Exclusive);

Audio rendering works like this:

// Oboe stream callback
DataCallbackResult audio_engine::onAudioReady(AudioStream* self, void* audio_data, int32_t num_frames) {
    auto stream = static_cast<int16_t*>(audio_data);
    sound->render(stream, num_frames);
    return DataCallbackResult::Continue; // keep the stream running
}

// Sound::render method
sound::render(int16_t* audio_data, int32_t num_frames) {
    auto iter = pcm_data.begin();
    std::advance(iter, cur_frame);

    const int32_t rem_size = std::min(num_frames, size - cur_frame);
    for(int32_t i = 0; i < rem_size; ++i, std::next(iter), ++cur_frame) {
        audio_data[i] += *iter;
    }
}

2 Answers

3
votes

It looks like your render() method is confusing samples and frames. A frame is a set of simultaneous samples. In a stereo stream, each frame has TWO samples.

I think your iterator works on a sample basis. In other words, each step of the iterator moves to the next sample, not the next frame. Try this (untested) code.

void sound::render(int16_t* audio_data, int32_t num_frames) {
    auto iter = pcm_data.begin();
    const int samples_per_frame = 2; // stereo
    std::advance(iter, cur_sample);

    const int32_t num_samples = std::min(num_frames * samples_per_frame,
              total_samples - cur_sample);
    // ++iter moves the iterator in place; std::next(iter) would only return a copy
    for(int32_t i = 0; i < num_samples; ++i, ++iter, ++cur_sample) {
        audio_data[i] += *iter;
    }
}
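
To make the frame/sample arithmetic concrete, here is a minimal sketch of just the size calculation. The helper name samples_to_copy and its parameters are hypothetical and not part of Oboe; it assumes interleaved PCM with one sample per channel in each frame.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: how many interleaved samples one callback may copy.
// A frame holds one sample per channel, so a stereo frame is 2 samples.
inline int32_t samples_to_copy(int32_t num_frames, int32_t channel_count,
                               int32_t total_samples, int32_t cur_sample) {
    const int32_t requested = num_frames * channel_count;  // frames -> samples
    const int32_t remaining = total_samples - cur_sample;  // samples left to play
    return std::max<int32_t>(0, std::min(requested, remaining));
}
```

For example, a callback asking for 192 stereo frames needs 384 samples, which is twice what a loop bounded by num_frames alone would write.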
0
votes

In short: I was experiencing an underrun because I used std::forward_list to store the PCM data. When iterators are used to retrieve PCM like this, the container's iterator must satisfy LegacyRandomAccessIterator (e.g. std::vector).


I was sure that the linear complexity of std::advance and std::next made no difference in my sound::render method. However, while trying raw pointers and pointer arithmetic (thus constant complexity) with the debugging approach suggested in the comments (extracting PCM from the WAV with Audacity, then loading that asset directly into memory with AAssetManager), I realized that the amount of "corruption" in the output sound was directly proportional to the position argument of std::advance(iter, position) in the render method.
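
The cost difference can be illustrated with a small sketch. The function advance_cost is a hypothetical name of my own; it only models how many iterator steps std::advance has to take for a given container, based on the iterator category, and does not measure real time.

```cpp
#include <cstddef>
#include <cstdint>
#include <forward_list>
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical model of the work std::advance does to move n positions:
// a single jump for random-access iterators, n single increments otherwise.
template <typename Container>
constexpr std::size_t advance_cost(std::size_t n) {
    using category = typename std::iterator_traits<
        typename Container::iterator>::iterator_category;
    return std::is_base_of<std::random_access_iterator_tag, category>::value
               ? (n == 0 ? 0 : 1)
               : n;
}
```

Under this model, seeking one second into 44.1 kHz stereo data (88200 samples) takes 88200 steps with std::forward_list<int16_t> and one step with std::vector<int16_t>, and that work is repeated on every audio callback.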

So, if the amount of sound corruption was directly proportional to the cost of std::advance (and std::next), I had to make that cost constant by using std::vector as the container. Combined with the answer from @philburk, this gave me a working result:

class sound {
    private:
        const int samples_per_frame = 2; // stereo
        std::vector<int16_t> pcm_data;
        ...
    public:
        void render(int16_t* audio_data, int32_t num_frames) {
            // O(1) on a vector: its iterators are random access
            auto iter = std::next(pcm_data.begin(), cur_sample);
            const int32_t s = std::min(num_frames * samples_per_frame,
                                       total_samples - cur_sample);

            for(int32_t i = 0; i < s; ++i, std::advance(iter, 1), ++cur_sample) {
                audio_data[i] += *iter;
            }
        }
};
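
For anyone who wants to try the idea outside of an Android project, here is a self-contained sketch of the same approach. The class name sound_sketch, its constructor, and the return value are my own illustrative additions, not part of the code above or of Oboe. It assumes interleaved stereo PCM and, like the original, mixes into the output buffer with +=.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for the sound class above; names are illustrative.
class sound_sketch {
    static constexpr int32_t samples_per_frame = 2; // stereo, interleaved L/R
    std::vector<int16_t> pcm_data;
    int32_t cur_sample = 0;

public:
    explicit sound_sketch(std::vector<int16_t> pcm) : pcm_data(std::move(pcm)) {}

    // Mixes up to num_frames frames into audio_data; returns samples written.
    int32_t render(int16_t* audio_data, int32_t num_frames) {
        const int32_t total_samples = static_cast<int32_t>(pcm_data.size());
        const int32_t count = std::min(num_frames * samples_per_frame,
                                       total_samples - cur_sample);
        for (int32_t i = 0; i < count; ++i) {
            audio_data[i] += pcm_data[cur_sample + i]; // O(1) indexed access
        }
        cur_sample += count;
        return count;
    }
};
```

Indexing the vector directly avoids iterators altogether, so the work per callback depends only on the number of samples requested and not on the playback position.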