2 votes

Using MS Media Foundation, I'm attempting to create a video (H.264/AAC) from image frames, and add an audio track consisting of sound effects at various places. There will be gaps in the audio stream between sound effects. I'm using an IMFSinkWriter configured with an audio and video stream (details below). I'm currently testing with just a single sound effect placed 2 seconds into the video. The MP4 file renders without error, and plays correctly (sound effect plays at correct location) under Windows (via Windows Media Player or "Movies & TV"). However, when I play the video under MacOS (QuickTime), the audio is not synced correctly. The sound effect occurs much earlier than expected.

Details

My SinkWriter is configured with a video stream, with output subtype MFVideoFormat_H264, and input subtype MFVideoFormat_RGB32. The audio stream is configured with output subtype MFAudioFormat_AAC and input subtype MFAudioFormat_PCM (matching the IMFSourceReader providing the audio samples).
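Roughly, the configuration looks like the sketch below. The 1080p/30 fps frame size, the 44.1 kHz 16-bit stereo PCM format, and the bitrates are just placeholder values; COM cleanup and most error handling are stripped for brevity.

    // Link with mfplat.lib, mfreadwrite.lib, mfuuid.lib
    #include <mfapi.h>
    #include <mfidl.h>
    #include <mfreadwrite.h>

    // Evaluate an expression once and bail out on failure.
    #define CHECK(expr) { HRESULT _hr = (expr); if (FAILED(_hr)) return _hr; }

    HRESULT CreateWriter(IMFSinkWriter** ppWriter, DWORD* pVideoIdx, DWORD* pAudioIdx)
    {
        IMFSinkWriter* pWriter = nullptr;
        CHECK(MFCreateSinkWriterFromURL(L"output.mp4", nullptr, nullptr, &pWriter));

        // Video output stream: H.264 (example size, rate, and bitrate)
        IMFMediaType* pVideoOut = nullptr;
        CHECK(MFCreateMediaType(&pVideoOut));
        CHECK(pVideoOut->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video));
        CHECK(pVideoOut->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_H264));
        CHECK(pVideoOut->SetUINT32(MF_MT_AVG_BITRATE, 4000000));
        CHECK(pVideoOut->SetUINT32(MF_MT_INTERLACE_MODE, MFVideoInterlace_Progressive));
        CHECK(MFSetAttributeSize(pVideoOut, MF_MT_FRAME_SIZE, 1920, 1080));
        CHECK(MFSetAttributeRatio(pVideoOut, MF_MT_FRAME_RATE, 30, 1));
        CHECK(pWriter->AddStream(pVideoOut, pVideoIdx));

        // Video input type: RGB32 frames
        IMFMediaType* pVideoIn = nullptr;
        CHECK(MFCreateMediaType(&pVideoIn));
        CHECK(pVideoIn->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video));
        CHECK(pVideoIn->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_RGB32));
        CHECK(pVideoIn->SetUINT32(MF_MT_INTERLACE_MODE, MFVideoInterlace_Progressive));
        CHECK(MFSetAttributeSize(pVideoIn, MF_MT_FRAME_SIZE, 1920, 1080));
        CHECK(MFSetAttributeRatio(pVideoIn, MF_MT_FRAME_RATE, 30, 1));
        CHECK(pWriter->SetInputMediaType(*pVideoIdx, pVideoIn, nullptr));

        // Audio output stream: AAC (24000 bytes/s = 192 kbps, a value the AAC encoder accepts)
        IMFMediaType* pAudioOut = nullptr;
        CHECK(MFCreateMediaType(&pAudioOut));
        CHECK(pAudioOut->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Audio));
        CHECK(pAudioOut->SetGUID(MF_MT_SUBTYPE, MFAudioFormat_AAC));
        CHECK(pAudioOut->SetUINT32(MF_MT_AUDIO_SAMPLES_PER_SECOND, 44100));
        CHECK(pAudioOut->SetUINT32(MF_MT_AUDIO_NUM_CHANNELS, 2));
        CHECK(pAudioOut->SetUINT32(MF_MT_AUDIO_BITS_PER_SAMPLE, 16));
        CHECK(pAudioOut->SetUINT32(MF_MT_AUDIO_AVG_BYTES_PER_SECOND, 24000));
        CHECK(pWriter->AddStream(pAudioOut, pAudioIdx));

        // Audio input type: PCM, matching the IMFSourceReader output
        IMFMediaType* pAudioIn = nullptr;
        CHECK(MFCreateMediaType(&pAudioIn));
        CHECK(pAudioIn->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Audio));
        CHECK(pAudioIn->SetGUID(MF_MT_SUBTYPE, MFAudioFormat_PCM));
        CHECK(pAudioIn->SetUINT32(MF_MT_AUDIO_SAMPLES_PER_SECOND, 44100));
        CHECK(pAudioIn->SetUINT32(MF_MT_AUDIO_NUM_CHANNELS, 2));
        CHECK(pAudioIn->SetUINT32(MF_MT_AUDIO_BITS_PER_SAMPLE, 16));
        CHECK(pWriter->SetInputMediaType(*pAudioIdx, pAudioIn, nullptr));

        CHECK(pWriter->BeginWriting());
        *ppWriter = pWriter;
        return S_OK;
    }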

I write all the video frames first, and then write the audio samples. When writing the audio, I use SendStreamTick (every 0.5 seconds) when there are gaps in the audio, both before and after the sound effect, and I set MFSampleExtension_Discontinuity on the first audio sample. I also tried sending NotifyEndOfSegment after the sound effect, but that didn't seem to make a difference. A rough outline of the audio pass is sketched below.
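In reality the sound effect arrives from the IMFSourceReader as multiple samples; the single-buffer helper below is a simplification of what I'm doing, reusing the CHECK macro from the sketch above.

    const LONGLONG HNS_PER_SEC = 10000000;   // MF timestamps are in 100-ns units

    HRESULT WriteAudio(IMFSinkWriter* pWriter, DWORD audioIdx,
                       const BYTE* pEffectPcm, DWORD cbEffect,    // PCM bytes of the sound effect
                       LONGLONG effectStart, LONGLONG effectDur,  // 2-second offset, effect duration
                       LONGLONG totalDur)                         // total length of the video
    {
        // Gap before the effect: one stream tick every 0.5 seconds
        for (LONGLONG t = 0; t < effectStart; t += HNS_PER_SEC / 2)
            CHECK(pWriter->SendStreamTick(audioIdx, t));

        // The sound effect itself, flagged as a discontinuity
        IMFMediaBuffer* pBuffer = nullptr;
        CHECK(MFCreateMemoryBuffer(cbEffect, &pBuffer));
        BYTE* pData = nullptr;
        CHECK(pBuffer->Lock(&pData, nullptr, nullptr));
        memcpy(pData, pEffectPcm, cbEffect);
        CHECK(pBuffer->Unlock());
        CHECK(pBuffer->SetCurrentLength(cbEffect));

        IMFSample* pSample = nullptr;
        CHECK(MFCreateSample(&pSample));
        CHECK(pSample->AddBuffer(pBuffer));
        CHECK(pSample->SetSampleTime(effectStart));
        CHECK(pSample->SetSampleDuration(effectDur));
        CHECK(pSample->SetUINT32(MFSampleExtension_Discontinuity, TRUE));
        CHECK(pWriter->WriteSample(audioIdx, pSample));
        pSample->Release();
        pBuffer->Release();

        // Gap after the effect: more stream ticks up to the end of the video
        for (LONGLONG t = effectStart + effectDur; t < totalDur; t += HNS_PER_SEC / 2)
            CHECK(pWriter->SendStreamTick(audioIdx, t));

        return S_OK;
    }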

I don't write a sample description box because I believe that it is auto-generated for my configuration.

Any help would be appreciated. Thanks!

Open it in a suitable video editor and check the timing of the video and audio tracks. Maybe you have more video frames than the audio track covers. Also check the frame duration. – Evgeny Pereguda

2 Answers

2 votes

The MP4 file renders without error, and plays correctly (sound effect plays at correct location) under Windows (via Windows Media Player or "Movies & TV"). However, when I play the video under MacOS (QuickTime), the audio is not synced correctly. The sound effect occurs much earlier than expected.

Different players handle track gaps differently, and quite often they fail to maintain good sync between tracks. More confusingly, they fail in different ways: some skip the gap while staying in sync; others keep playing the "master" track smoothly while ignoring the gap on the other track.

That is, even if a file is created with correct data timings, it can happen (and does happen) that players fail to play it well.

The best strategy for producing files that play well in all players is to avoid gaps in the video and audio track data. For audio, encoding artificial silence is a good solution.

1 vote

I came up with a solution that seems to work just fine: instead of using SendStreamTick, I write silence (zeros) to the audio stream, roughly as sketched below.
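The 0.1-second chunk size and the 44.1 kHz 16-bit stereo format below are example values, and CHECK is the same error-check macro as in the sketches above.

    // Fill a gap [start, end) in the audio track with PCM samples containing zeros.
    HRESULT WriteSilence(IMFSinkWriter* pWriter, DWORD audioIdx,
                         LONGLONG start, LONGLONG end)   // gap boundaries in 100-ns units
    {
        const UINT32 sampleRate = 44100, channels = 2, bytesPerSample = 2;
        const LONGLONG chunkDur = 1000000;               // 0.1 s in 100-ns units
        const DWORD chunkBytes = sampleRate * channels * bytesPerSample / 10;  // bytes per 0.1 s

        for (LONGLONG t = start; t < end; t += chunkDur)
        {
            IMFMediaBuffer* pBuffer = nullptr;
            CHECK(MFCreateMemoryBuffer(chunkBytes, &pBuffer));

            BYTE* pData = nullptr;
            CHECK(pBuffer->Lock(&pData, nullptr, nullptr));
            memset(pData, 0, chunkBytes);                // digital silence
            CHECK(pBuffer->Unlock());
            CHECK(pBuffer->SetCurrentLength(chunkBytes));

            IMFSample* pSample = nullptr;
            CHECK(MFCreateSample(&pSample));
            CHECK(pSample->AddBuffer(pBuffer));
            CHECK(pSample->SetSampleTime(t));
            CHECK(pSample->SetSampleDuration(chunkDur < end - t ? chunkDur : end - t));
            CHECK(pWriter->WriteSample(audioIdx, pSample));

            pSample->Release();
            pBuffer->Release();
        }
        return S_OK;
    }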