I have an audio file, that have some silences, which I am detecting with ffmpeg detectsilence and then trying to remove with removesilence, however there is some strange behavior. Specifically:
1) File's Basic info based on ffprobe show_streams
Input #0, mp3, from 'my_file.mp3':
Metadata:
encoder : Lavf58.64.100
Duration: 00:00:25.22, start: 0.046042, bitrate: 32 kb/s
Stream #0:0: Audio: mp3, 24000 Hz, mono, fltp, 32 kb/s
2) Using detectsilence
ffmpeg -i my_file.mp3 -af silencedetect=noise=-50dB:d=0.2 -f null -
I get this result
[mp3float @ 000001ee50074280] overread, skip -7 enddists: -1 -1
[silencedetect @ 000001ee5008a1c0] silence_start: 6.21417
[silencedetect @ 000001ee5008a1c0] silence_end: 6.91712 | silence_duration: 0.702958
[silencedetect @ 000001ee5008a1c0] silence_start: 16.44
[silencedetect @ 000001ee5008a1c0] silence_end: 17.1547 | silence_duration: 0.714708
[mp3float @ 000001ee50074280] overread, skip -10 enddists: -3 -3
[mp3float @ 000001ee50074280] overread, skip -5 enddists: -4 -4
[silencedetect @ 000001ee5008a1c0] silence_start: 24.4501
size=N/A time=00:00:25.17 bitrate=N/A speed=1.32e+03x
video:0kB audio:1180kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
[silencedetect @ 000001ee5008a1c0] silence_end: 25.176 | silence_duration: 0.725917
That also match the values and points based on Adobe Audition
So far all good.
3) Now, based on some calculations (which is based on application's logic on what should be the final duration of the audio) I am trying to delete the silence with "0.725917"s duration. For that, based on ffmpeg docs (https://ffmpeg.org/ffmpeg-filters.html#silencedetect)
Trim all silence encountered from beginning to end where there is more than 1 second of silence in audio: silenceremove=stop_periods=-1:stop_duration=1:stop_threshold=-90dB
I run this command
ffmpeg -i my_file.mp3 -af silenceremove=stop_periods=-1:stop_threshold=-50dB:stop_duration=0.72 result1.mp3
So, I am expecting that it should delete only the silence with "0.725917" duration (the last one in the above image), however it is deleting the silence that starts at 16.44s with duration of "0.714708"s. Please see the following comparison:
4) Running detectsilence on result1.mp3 with same options gives even stranger results
ffmpeg -i result1.mp3 -af silencedetect=noise=-50dB:d=0.2 -f null -
result
[mp3float @ 0000017723404280] overread, skip -5 enddists: -4 -4
[silencedetect @ 0000017723419540] silence_start: 6.21417
[silencedetect @ 0000017723419540] silence_end: 6.92462 | silence_duration: 0.710458
[mp3float @ 0000017723404280] overread, skip -7 enddists: -6 -6
[mp3float @ 0000017723404280] overread, skip -7 enddists: -2 -2
[mp3float @ 0000017723404280] overread, skip -6 enddists: -1 -1
Last message repeated 1 times
[silencedetect @ 0000017723419540] silence_start: 23.7308
size=N/A time=00:00:24.45 bitrate=N/A speed=1.33e+03x
video:0kB audio:1146kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
[silencedetect @ 0000017723419540] silence_end: 24.456 | silence_duration: 0.725167
So, the results are:
- With command to remove silences that are longer than "0.72 second", a silence that was "0.714708"s, got removed and - a silence with "0.725917"s remained as is (well, actually changed a little - as per 3rd point)
- The first silence that had started at "6.21417" and had a duration of "0.702958"s, suddenly now has a duration of "0.710458"s
- The 3rd silence that had started at "24.4501" (which now starts at 23.7308 - obviously because the 2nd silence was removed) and had a duration of "0.725917", now suddenly is "0.725167"s (this one is not a big difference, but still why even removing other silence, this silence's duration should change at all).
Accordingly the expected results are:
- Only the silences that match the provided condition (stop_duration=0.72) should be removed. In this specific example only the last one, but in general any silence that matches the condition of the length - irrelevant of their positioning (start, end or in the middle)
- Other silences should remain with same exact duration they were before
FFMpeg: 4.2.4-1ubuntu0.1, Ubuntu: 20.04.2
Some attempts and results, while playing with ffmpeg options
a)
ffmpeg -i my_file.mp3 -af silenceremove=stop_periods=-1:stop_threshold=-50dB:stop_duration=0.72:detection=peak tmp1.mp3
result: First and second silences are removed, 3rd silence's duration remains exactly the same
b)
ffmpeg -i my_file.mp3 -af silenceremove=stop_periods=-1:stop_threshold=-50dB:stop_duration=0.71 tmp_0.71.mp3
result: First and second silences are removed, 3rd silence remains, but the duration becomes "0.72075"s
c)
ffmpeg -i my_file.mp3 -af silenceremove=stop_periods=-1:stop_threshold=-50dB:stop_duration=0.7 tmp_0.7.mp3
result: all 3 silence are removed
d) the edge case
this command still removes the second silence (after which the first silence become exactly as in point #4 and last silence becomes "0.721375")
ffmpeg -i my_file.mp3 -af silenceremove=stop_periods=-1:stop_threshold=-50dB:stop_duration=0.72335499999 tmp_0.72335499999.mp3
but this one, again does not remove any silence:
ffmpeg -i my_file.mp3 -af silenceremove=stop_periods=-1:stop_threshold=-50dB:stop_duration=0.723355 tmp_0.723355.mp3
e) window param case 0.03
ffmpeg -i my_file.mp3 -af silenceremove=stop_periods=-1:stop_threshold=-50dB:stop_duration=0.72:window=0.03 window_0.03.mp3
does not remove any silence, but the detect silence
ffmpeg -i window_0.03.mp3 -af silencedetect=noise=-50dB:d=0.2 -f null -
gives this result (compare with silences in result1.mp3 - from point #4 )
[mp3float @ 000001c5c8824280] overread, skip -5 enddists: -4 -4
[silencedetect @ 000001c5c883a040] silence_start: 6.21417
[silencedetect @ 000001c5c883a040] silence_end: 6.92462 | silence_duration: 0.710458
[mp3float @ 000001c5c8824280] overread, skip -7 enddists: -6 -6
[mp3float @ 000001c5c8824280] overread, skip -7 enddists: -2 -2
[silencedetect @ 000001c5c883a040] silence_start: 16.4424
[silencedetect @ 000001c5c883a040] silence_end: 17.1555 | silence_duration: 0.713167
[mp3float @ 000001c5c8824280] overread, skip -6 enddists: -1 -1
Last message repeated 1 times
[silencedetect @ 000001c5c883a040] silence_start: 24.4508
size=N/A time=00:00:25.17 bitrate=N/A speed=1.24e+03x
video:0kB audio:1180kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
[silencedetect @ 000001c5c883a040] silence_end: 25.176 | silence_duration: 0.725167
f) window case 0.01
ffmpeg -i my_file.mp3 -af silenceremove=stop_periods=-1:stop_threshold=-50dB:stop_duration=0.72:window=0.01 window_0.01.mp3
removes first and second silences, the detect silence with same params has the following result
[mp3float @ 000001ea631d4280] overread, skip -5 enddists: -4 -4
Last message repeated 1 times
[mp3float @ 000001ea631d4280] overread, skip -7 enddists: -2 -2
[mp3float @ 000001ea631d4280] overread, skip -6 enddists: -1 -1
Last message repeated 1 times
[silencedetect @ 000001ea631ea1c0] silence_start: 23.0108
size=N/A time=00:00:23.73 bitrate=N/A speed=1.2e+03x
video:0kB audio:1113kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
[silencedetect @ 000001ea631ea1c0] silence_end: 23.736 | silence_duration: 0.725167
Any thoughts, ideas, points are much appreciated.
window=
anddetection=
? – user3840170