I have a 2D NumPy array containing an audio spectrogram and I want to save it as an image.
I'm using the librosa library to compute the spectrum, and I can plot it with the librosa.display.specshow()
function. It supports a number of different frequency-axis scaling types, as you can see below.
import PIL.Image
import librosa
import librosa.display
import matplotlib.pyplot as plt

def display_spectrogram(spectrum, sampling_rate):
    """
    Frequency types:
    'linear', 'fft', 'hz' : frequency range is determined by the FFT window and sampling rate.
    'log' : the spectrum is displayed on a log scale.
    'mel' : frequencies are determined by the mel scale.
    'cqt_hz' : frequencies are determined by the CQT scale.
    'cqt_note' : pitches are determined by the CQT scale.
    """
    # Plot with time on the x-axis and a log-scaled frequency axis.
    librosa.display.specshow(spectrum, sr=sampling_rate, x_axis='time', y_axis='log')
    plt.colorbar(format='%+2.0f dB')
    plt.title('Spectrogram')
    plt.show()
I can also convert the spectrogram (a NumPy array) to an image and save it, as shown below.
img = PIL.Image.fromarray(spectrum)
img.save("out.png")
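One caveat with saving the array directly: a spectrogram in dB is usually a floating-point array, and Pillow's PNG writer may reject floating-point data. A minimal sketch of one workaround is to rescale the values to 8-bit grayscale first. The helper name `spectrogram_to_image` is hypothetical, and the min/max normalization and the vertical flip (so that low frequencies end up at the bottom of the picture) are assumptions about what the saved image should look like.

```python
import numpy as np
from PIL import Image

def spectrogram_to_image(spectrum):
    """Hypothetical helper: map a 2D float array to an 8-bit grayscale image."""
    spectrum = np.asarray(spectrum, dtype=float)
    lo, hi = spectrum.min(), spectrum.max()
    # Avoid dividing by zero when the array is constant.
    span = hi - lo if hi > lo else 1.0
    # Rescale to [0, 255] and convert to unsigned 8-bit integers.
    pixels = np.round((spectrum - lo) / span * 255).astype(np.uint8)
    # Assumption: row 0 holds the lowest frequency, so flip vertically
    # to draw low frequencies at the bottom of the image.
    return Image.fromarray(np.ascontiguousarray(pixels[::-1]))
```

Calling `spectrogram_to_image(spectrum).save("out.png")` would then write a grayscale PNG; any colormap that specshow applies is not reproduced here.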
I have the original spectrogram (linearly scaled) and I want to save it with the y-axis on a log scale. I looked into the library's source code to understand how it does the scaling, but I could not figure it out.
How can I log-scale the y-axis of an image / 2D NumPy array?
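To make the goal concrete, here is a rough sketch of one possible approach: resample the rows of the array onto logarithmically spaced frequencies with linear interpolation. This is not necessarily how librosa draws its 'log' axis. The function name `log_scale_rows` and the parameters `f_min` and `n_out` are hypothetical, and the sketch assumes that the rows are evenly spaced from 0 Hz to the Nyquist frequency.

```python
import numpy as np

def log_scale_rows(spectrum, sampling_rate, f_min=20.0, n_out=None):
    """Hypothetical helper: resample rows from linear to log-spaced frequencies.

    Assumes row i corresponds to frequency i * (sampling_rate / 2) / (n_bins - 1).
    """
    spectrum = np.asarray(spectrum, dtype=float)
    n_bins, n_frames = spectrum.shape
    if n_out is None:
        n_out = n_bins
    nyquist = sampling_rate / 2.0
    # Frequency (Hz) of each input row, from 0 up to Nyquist.
    linear_freqs = np.linspace(0.0, nyquist, n_bins)
    # Target frequencies, geometrically spaced between f_min and Nyquist.
    log_freqs = np.geomspace(f_min, nyquist, n_out)
    out = np.empty((n_out, n_frames), dtype=float)
    # Interpolate every time frame (column) at the log-spaced frequencies.
    for t in range(n_frames):
        out[:, t] = np.interp(log_freqs, linear_freqs, spectrum[:, t])
    return out
```

The result has the same number of columns as the input and can be converted to an image like any other 2D array. The choice of `f_min` matters because 0 Hz has no position on a log axis, so everything below `f_min` is dropped.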