I want to use spectrograms to classify audio files with a CNN. The problem is that my audio files have different lengths (between 2 and 17 seconds), and when I generate the spectrograms they all have the same size, which means the spectrogram is stretched along the time axis for the shorter audio files. How can I generate the spectrograms so that the signal is not altered?
I tried using the matplotlib.pyplot library to create the spectrograms, but all the images are 640 x 480.
This is the code I used:
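import matplotlib.pyplot as plt
from scipy.io import wavfile
# Read the sample rate (Hz) and the raw samples from the WAV file
samplingFrequency, signalData = wavfile.read('dia0_utt0.wav')
plt.title('Spectrogram')
# Draw the spectrogram on the current (default-sized) figure
plt.specgram(signalData, Fs=samplingFrequency, NFFT=512)
plt.xlabel('Time')
plt.ylabel('Frequency')
# The saved image has the same pixel size regardless of the audio length
plt.savefig('fig11.png')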
I don't know how to obtain spectrograms whose dimensions vary with the length of the audio, or alternatively spectrograms of the same dimensions where the remainder, up to the maximum length, is filled with no information. For example, if I have a 3-second file and the maximum length is 17 seconds, I would like to generate the spectrogram for the 3 seconds and fill the rest with silence so that it covers 17 seconds.
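To make the padding idea concrete, here is a rough sketch of what I have in mind. I have not validated it. It assumes scipy.signal.spectrogram instead of plt.specgram, so that the result is a numeric array rather than a fixed-size image. The function name padded_spectrogram, the max_seconds default of 17, the 50% overlap, and the synthetic sine-wave clips are my own placeholders, not something taken from a library or from my real data.

```python
import numpy as np
from scipy import signal


def padded_spectrogram(samples, sample_rate, max_seconds=17.0, nfft=512):
    """Zero-pad a clip to max_seconds and return (freqs, times, spectrogram in dB).

    Hypothetical helper: the name and the default values are illustrative only.
    """
    samples = np.asarray(samples, dtype=np.float64)
    if samples.ndim > 1:
        # Assumption: multi-channel audio is mixed down to mono by averaging
        samples = samples.mean(axis=1)

    target_len = int(round(max_seconds * sample_rate))
    samples = samples[:target_len]  # truncate anything longer than the maximum
    # Append zeros (digital silence) so every clip has the same number of samples
    padded = np.pad(samples, (0, target_len - len(samples)))

    freqs, times, power = signal.spectrogram(
        padded, fs=sample_rate, nperseg=nfft, noverlap=nfft // 2
    )
    # Small offset avoids log(0) in the silent (zero-padded) region
    return freqs, times, 10.0 * np.log10(power + 1e-10)


# Synthetic stand-ins for a 3-second and a 17-second recording at 16 kHz
sample_rate = 16000
short_clip = np.sin(2 * np.pi * 440 * np.arange(3 * sample_rate) / sample_rate)
long_clip = np.sin(2 * np.pi * 440 * np.arange(17 * sample_rate) / sample_rate)

_, _, spec_short = padded_spectrogram(short_clip, sample_rate)
_, _, spec_long = padded_spectrogram(long_clip, sample_rate)
print(spec_short.shape, spec_long.shape)  # both arrays have identical dimensions
```

With this approach the time axis keeps the same scale for every file, and the short clip's spectrogram simply has a constant floor value after the first 3 seconds. If variable-width arrays are acceptable, the padding step could be skipped, and the number of time columns would then grow with the clip length.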