3
votes

I am new to audio editing libraries, specifically Pydub. I want to change the playback speed of some audio files (say, .wav or .mp3) using Pydub, but I don't know how to do it. The only function I found that might handle this is speedup in effects.py. However, there is no explanation of how I am supposed to call it.

Could anyone kindly explain how to do this task in Pydub? Many thanks!

(A related question is "Pydub - How to change frame rate without changing playback speed", but what I want is to change the playback speed without changing the audio quality.)

4
Specifically, I don't understand the arguments in speedup(seg, playback_speed=1.5, chunk_size=150, crossfade=25), especially chunk_size and crossfade. Could anyone kindly explain them? Thanks! - itsMe
chunk_size and crossfade are optional, so you can just leave them out :) A playback speed of 1.5 will play 1.5x faster than the original sound. It works by splitting the sound into chunks (150 ms long by default) and then overlapping them to shorten the total playback duration (with crossfades of 25 ms, by default). - Jiaaro
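
To make the comment above concrete, here is a rough sketch. The helper names `estimated_duration_ms` and `speed_up_file` are made up for this example, and the estimate ignores how the last chunk and the crossfades are handled, so the real output length will differ slightly.

```python
def estimated_duration_ms(original_ms, playback_speed=1.5):
    """Rough output length: audio played N times faster lasts about 1/N as long."""
    if playback_speed <= 0:
        raise ValueError("playback_speed must be positive")
    return original_ms / playback_speed


def speed_up_file(in_path, out_path, playback_speed=1.5,
                  chunk_size=150, crossfade=25, out_format="wav"):
    """Hypothetical wrapper around pydub's speedup effect."""
    # Imported here so the estimate above can be used without pydub installed
    from pydub import AudioSegment
    from pydub.effects import speedup

    sound = AudioSegment.from_file(in_path)
    # chunk_size and crossfade are in milliseconds; the values shown are the defaults
    faster = speedup(sound, playback_speed=playback_speed,
                     chunk_size=chunk_size, crossfade=crossfade)
    faster.export(out_path, format=out_format)
    return faster


# A 60-second clip at 1.5x should come out at roughly 40 seconds
print(estimated_duration_ms(60_000, 1.5))
```

Leaving chunk_size and crossfade at their defaults, as suggested above, is the same as calling `speedup(sound, playback_speed=1.5)`.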

4 Answers

5
votes

sound.set_frame_rate() does a conversion, so it should not cause any "chipmunk effect". What you can do instead is change the frame rate without a conversion, and then convert the audio from there back to a normal frame rate (like 44.1 kHz, "CD quality").

from pydub import AudioSegment
sound = AudioSegment.from_file(…)

def speed_change(sound, speed=1.0):
    # Manually override the frame_rate. This tells the computer how many
    # samples to play per second
    sound_with_altered_frame_rate = sound._spawn(sound.raw_data, overrides={
        "frame_rate": int(sound.frame_rate * speed)
    })
    # Convert the sound with altered frame rate to a standard frame rate
    # so that regular playback programs will work right. They often only
    # know how to play audio at standard frame rates (like 44.1k)
    return sound_with_altered_frame_rate.set_frame_rate(sound.frame_rate)


slow_sound = speed_change(sound, 0.75)
fast_sound = speed_change(sound, 2.0)
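
One thing worth spelling out about the function above: overriding the frame rate changes the pitch together with the speed, like playing a tape faster. Below is a rough sketch of the arithmetic using only the standard library. The helper name `frame_rate_trick_effect` is made up for illustration and is not part of pydub.

```python
import math


def frame_rate_trick_effect(duration_ms, frame_rate, speed):
    """Estimate what overriding the frame rate does to a clip.

    Returns (overridden_rate, new_duration_ms, pitch_shift_in_semitones).
    """
    # Same truncation to an integer rate as in speed_change
    overridden_rate = int(frame_rate * speed)
    ratio = overridden_rate / frame_rate
    # The same samples are played `ratio` times faster, so the clip gets shorter
    new_duration_ms = duration_ms / ratio
    # Every frequency is multiplied by `ratio`; one doubling is 12 semitones
    semitones = 12 * math.log2(ratio)
    return overridden_rate, new_duration_ms, semitones


print(frame_rate_trick_effect(10_000, 44100, 2.0))
print(frame_rate_trick_effect(10_000, 44100, 0.75))
```

At 2.0x the clip is half as long and an octave higher; at 0.75x it is longer and about five semitones lower.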
3
votes
from pydub import AudioSegment
from pydub import effects

root = r'audio.wav'
velocidad_X = 1.5  # Cannot be below 1.0

sound = AudioSegment.from_file(root)
so = sound.speedup(velocidad_X, 150, 25)
so.export(root[:-4] + '_Out.mp3', format = 'mp3')
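
Since the comment in the snippet above says the speed cannot be below 1.0, a small guard can make that explicit. This is only a sketch: `checked_speedup` is a hypothetical helper, and returning the sound unchanged at exactly 1.0 is my own choice, not something pydub does.

```python
def checked_speedup(sound, playback_speed, chunk_size=150, crossfade=25):
    """Call sound.speedup only for speeds above 1.0 (hypothetical helper)."""
    if playback_speed < 1.0:
        raise ValueError(
            "speedup expects playback_speed >= 1.0, got {}".format(playback_speed)
        )
    if playback_speed == 1.0:
        # Nothing to do: hand the sound back unchanged
        return sound
    # `sound` is expected to be a pydub AudioSegment
    return sound.speedup(playback_speed, chunk_size, crossfade)
```

For speeds below 1.0, the frame-rate approach from the other answer or a time-stretching library is an option.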
1
votes

This can be done using the pyrubberband package, which requires the rubberband library; it can stretch audio while keeping the pitch and high quality. I was able to install the library on macOS using brew, and likewise on Ubuntu with apt install. For extreme stretching, look into PaulStretch.

brew install rubberband

This works simply with the librosa package:

import librosa
import pyrubberband
import soundfile as sf

y, sr = librosa.load(filepath, sr=None)
y_stretched = pyrubberband.time_stretch(y, sr, 1.5)
sf.write(analyzed_filepath, y_stretched, sr, format='wav')

To make pyrubberband work directly with a pydub AudioSegment, without librosa, I put together this function:

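import numpy as np
import pydub
import pyrubberband as pyrb


def change_audioseg_tempo(audiosegment, tempo, new_tempo):
    # Samples as a NumPy array; stereo audio is interleaved, so reshape to (frames, 2)
    y = np.array(audiosegment.get_array_of_samples())
    if audiosegment.channels == 2:
        y = y.reshape((-1, 2))

    sample_rate = audiosegment.frame_rate

    # A ratio above 1 speeds the audio up, below 1 slows it down
    tempo_ratio = new_tempo / tempo
    print(tempo_ratio)
    y_fast = pyrb.time_stretch(y, sample_rate, tempo_ratio)

    channels = 2 if (y_fast.ndim == 2 and y_fast.shape[1] == 2) else 1
    # Scale to 16-bit integers (this assumes time_stretch returned floats in [-1, 1))
    y = np.int16(y_fast * 2 ** 15)

    new_seg = pydub.AudioSegment(y.tobytes(), frame_rate=sample_rate, sample_width=2, channels=channels)

    return new_seg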
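
A side note on the `np.int16(y_fast * 2 ** 15)` step: if the stretched samples are floats that reach or exceed 1.0, the product falls outside the 16-bit range. Here is a minimal sketch of a clipped conversion using only the standard library. The helper `floats_to_int16` is hypothetical, and it assumes the input is float samples nominally in [-1, 1].

```python
from array import array


def floats_to_int16(samples):
    """Convert float samples (nominally in [-1, 1]) to clipped 16-bit integers."""
    out = array("h")  # signed 16-bit integers
    for sample in samples:
        value = int(round(sample * 32768))
        # Clamp to the int16 range so loud peaks saturate instead of overflowing
        out.append(max(-32768, min(32767, value)))
    return out


pcm = floats_to_int16([0.0, 0.5, 1.0, -1.0, 1.2])
print(list(pcm))
```

`pcm.tobytes()` then gives raw 16-bit data of the kind passed to `pydub.AudioSegment(..., sample_width=2, ...)` in the function above.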
1
votes

I know it's late, but I wrote a program to convert an mp3 to different playback speeds.

First, convert the .mp3 to .wav, because PyRubberBand supports only the .wav format. Then stretch the time and pitch at the same time to avoid the chipmunk effect.

import wave
import sys
from pydub import AudioSegment
#sound = AudioSegment.from_file("deviprasadgharpehai.mp3")
sound = AudioSegment.from_mp3(sys.argv[1])
sound.export("file.wav", format="wav")

print(sys.argv[1])

import soundfile as sf
import pyrubberband as pyrb
y, sr = sf.read("file.wav")
# Play back at extra low speed
y_stretch = pyrb.time_stretch(y, sr, 0.5)
# Shift pitch by 0.5 semitones (pitch_shift takes semitones; y_shift is not written out)
y_shift = pyrb.pitch_shift(y, sr, 0.5)
sf.write("analyzed_filepathX5.wav", y_stretch, sr, format='wav')

sound = AudioSegment.from_wav("analyzed_filepathX5.wav")
sound.export("analyzed_filepathX5.mp3", format="mp3")

# Play back at low speed
y_stretch = pyrb.time_stretch(y, sr, 0.75)
# Shift pitch by 0.75 semitones (pitch_shift takes semitones; y_shift is not written out)
y_shift = pyrb.pitch_shift(y, sr, 0.75)
sf.write("analyzed_filepathX75.wav", y_stretch, sr, format='wav')

sound = AudioSegment.from_wav("analyzed_filepathX75.wav")
sound.export("analyzed_filepathX75.mp3", format="mp3")

# Play back at 1.5X speed
y_stretch = pyrb.time_stretch(y, sr, 1.5)
# Shift pitch by 1.5 semitones (pitch_shift takes semitones; y_shift is not written out)
y_shift = pyrb.pitch_shift(y, sr, 1.5)
sf.write("analyzed_filepathX105.wav", y_stretch, sr, format='wav')

sound = AudioSegment.from_wav("analyzed_filepathX105.wav")
sound.export("analyzed_filepathX105.mp3", format="mp3")

# Play back at same speed
y_stretch = pyrb.time_stretch(y, sr, 1)
# Shift pitch by 1 semitone (pitch_shift takes semitones; y_shift is not written out)
y_shift = pyrb.pitch_shift(y, sr, 1)
sf.write("analyzed_filepathXnormal.wav", y_stretch, sr, format='wav')

sound = AudioSegment.from_wav("analyzed_filepathXnormal.wav")
sound.export("analyzed_filepathXnormal.mp3", format="mp3")

**Make sure to install:**

pydub (provides AudioSegment), FFmpeg, pyrubberband (which needs the rubberband command-line tool), and soundfile. The wave module ships with Python.

To use this, run:

python3 filename.py mp3filename.mp3
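
The same steps can also be folded into a loop instead of repeating the block for every speed. This is only a sketch under a few assumptions: the function names and the output naming scheme are made up here, and it relies on the same packages as the script above.

```python
def output_paths(stem, rate):
    """Build output names such as 'song_x0.75.wav' (the naming is an assumption)."""
    label = "{}_x{:g}".format(stem, rate)
    return label + ".wav", label + ".mp3"


def stretch_to_rates(mp3_path, rates=(0.5, 0.75, 1.5)):
    """Write one time-stretched MP3 per rate (hypothetical helper)."""
    # Third-party imports are local so the naming helper works without them
    import os
    import soundfile as sf
    import pyrubberband as pyrb
    from pydub import AudioSegment

    stem = os.path.splitext(os.path.basename(mp3_path))[0]
    # Decode the MP3 once and write a WAV that soundfile can read
    AudioSegment.from_mp3(mp3_path).export(stem + ".wav", format="wav")
    y, sr = sf.read(stem + ".wav")

    for rate in rates:
        wav_path, mp3_out = output_paths(stem, rate)
        # time_stretch changes the duration and keeps the pitch; rate > 1 is faster
        sf.write(wav_path, pyrb.time_stretch(y, sr, rate), sr, format="wav")
        AudioSegment.from_wav(wav_path).export(mp3_out, format="mp3")


print(output_paths("song", 0.75))
```

Calling `stretch_to_rates("mp3filename.mp3")` would then write one stretched .wav and .mp3 per rate into the current directory.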