
I want to create a NumPy array of arrays, where each sub-array has the shape [128, audio_length, 1] (audio_length varies from file to file), so that I can feed it into Keras.fit. However, I cannot figure out how to do this: np.array just throws a "cannot broadcast" error.

import librosa
import numpy as np

def prepare_data(df, config, data_dir, bands=128):
    # Returns a list of log-mel spectrograms; the time dimension differs per file.
    log_specgrams_2048 = []
    for fname in df.index:
        file_path = data_dir + fname
        data, _ = librosa.load(file_path, sr=config.sampling_rate, res_type="kaiser_fast")
        # Pass the signal as y=...; newer librosa releases expect keyword arguments here.
        melspec = librosa.feature.melspectrogram(y=data, sr=config.sampling_rate, n_mels=bands)
        logspec = librosa.power_to_db(melspec)  # shape: [128, audio_length]
        logspec = logspec[..., np.newaxis]  # shape: [128, audio_length, 1]
        # normalize_data is the asker's own helper, defined elsewhere.
        log_specgrams_2048.append(normalize_data(logspec))
    return log_specgrams_2048
Which np.array call throws this error? I don't see one in your code. Are you sure that Keras.fit will work with an object-dtype array containing arrays of varying size? - hpaulj
Hi hpaulj, the error is thrown when I try to convert log_specgrams_2048 to a NumPy array. I think Keras will work with varying input sizes, as I've built an FCN whose input is inp = Input(shape=(None, None, 1)). - Henry Hargreaves

1 Answer

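Within a single batch, every sample must have the same shape, so NumPy cannot stack spectrograms of different lengths into one regular array. You have to group sequences by length and call Keras.fit once per group.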
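
The grouping described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: group_by_length is a hypothetical helper, the toy lengths and labels are made up, and it assumes one label per clip. The Keras calls are left as comments and assume an already compiled model with Input(shape=(None, None, 1)), as in the asker's comment.

```python
from collections import defaultdict

import numpy as np


def group_by_length(specs, labels):
    """Group [128, T, 1] spectrograms by T and stack each group into one array."""
    buckets = defaultdict(list)
    for spec, label in zip(specs, labels):
        buckets[spec.shape[1]].append((spec, label))

    batches = {}
    for length, items in buckets.items():
        x = np.stack([spec for spec, _ in items])    # shape: [n, 128, length, 1]
        y = np.array([label for _, label in items])  # shape: [n]
        batches[length] = (x, y)
    return batches


# Toy data standing in for the output of prepare_data (hypothetical lengths).
specs = [np.zeros((128, t, 1), dtype=np.float32) for t in (40, 55, 40, 55, 40)]
labels = [0, 1, 1, 0, 1]
batches = group_by_length(specs, labels)

# Assumed usage with a compiled Keras model (not run here):
# for epoch in range(n_epochs):
#     for x, y in batches.values():
#         model.fit(x, y, epochs=1, verbose=0)
```

Each value in batches is a regular array, so the "cannot broadcast" problem does not arise inside a group.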

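The usual ways to handle this are:

  • Bucketing: group clips of equal (or similar) length so that each batch is uniform
  • Zero-padding: pad every clip to a common length
  • Batch of size 1: feed one clip at a time, so no two shapes ever need to match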
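
As an illustration of the zero-padding option, here is a minimal sketch using only NumPy. pad_to_longest is a hypothetical helper, and the pad value of 0.0 is an assumption: after power_to_db and the asker's normalize_data, zero may not correspond to silence, so a different fill value may suit the data better.

```python
import numpy as np


def pad_to_longest(specs, pad_value=0.0):
    """Right-pad [128, T, 1] spectrograms along the time axis to the longest T."""
    max_len = max(spec.shape[1] for spec in specs)
    padded = [
        np.pad(
            spec,
            ((0, 0), (0, max_len - spec.shape[1]), (0, 0)),  # pad the time axis only
            mode="constant",
            constant_values=pad_value,
        )
        for spec in specs
    ]
    return np.stack(padded)  # shape: [n, 128, max_len, 1]


# Toy data standing in for the output of prepare_data (hypothetical lengths).
specs = [np.ones((128, t, 1), dtype=np.float32) for t in (40, 55, 48)]
x = pad_to_longest(specs)
print(x.shape)  # (3, 128, 55, 1)
```

The result is a single regular array that can be passed to fit in one call, at the cost of the model seeing padded frames.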