
I'm sorry, I'm new to machine learning, but I'm trying to build a song genre classifier. After reading all the features from a CSV file, I normalized the data with MinMaxScaler like so and trained my model:

data.drop(data.columns[-1], axis=1, inplace=True)  # drop the label column in place
X = data.values  # returns a numpy array
myscaler = preprocessing.MinMaxScaler()
x_scaled = myscaler.fit_transform(X)
X = pd.DataFrame(x_scaled)
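One common way to reuse the training-time scaling at prediction time is to save the fitted scaler to disk and load it again in the prediction script. A minimal sketch, assuming scikit-learn with its bundled `joblib` (the toy data and the filename `scaler.pkl` are just examples):

```python
# Persist the scaler fitted on the TRAINING data so the prediction
# script can apply the exact same min/max transformation later.
import joblib  # ships with scikit-learn installations
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])  # toy feature matrix
myscaler = MinMaxScaler()
x_scaled = myscaler.fit_transform(X)   # fit ONLY on training data
joblib.dump(myscaler, "scaler.pkl")    # save for the prediction phase

loaded = joblib.load("scaler.pkl")     # e.g. in the prediction script
```

The loaded scaler remembers the per-feature min and max learned at training time, so `loaded.transform(new_song)` scales new data consistently with the training set.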

I trained the model and I'm now in the prediction phase: I want to predict the genre of a new song, so I ran the song through the same feature-extraction process I used for training. I'm not sure whether I should normalize this new data or not. When I didn't normalize it, I kept getting the same prediction every time. When I tried to normalize it, I first got the wrong shape for the model, and after reshaping I still don't think I'm getting the same result as the normalization from training, because even a song that is in the training dataset gets a wrong prediction. I'm fairly sure the model itself is fine, since it reaches 0.8 accuracy.

scaler = StandardScaler()
song = np.array(make_dataset_ml("C:\\Users\\USER\\Desktop\\sem8\\AI\\project\\try\\disco.mp3")).reshape(-1,1)
myscaler = preprocessing.MinMaxScaler()
scaled_song = myscaler.fit_transform(song)
song = pd.DataFrame(scaled_song.reshape(1,-1))
prediction = model.predict(song)

This is the only way I could get the correct shape after normalizing. `make_dataset_ml` is the function that extracts the same features used for training.


1 Answer


You are defining a NEW MinMaxScaler for each song, so each song is scaled relative to itself and the result is garbage. You should reuse the scaler you fitted during the training phase, and call its `transform` method (not `fit_transform`) on the new data.
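A minimal, self-contained sketch of the fix (toy numbers stand in for the real audio features; the key point is calling `transform()`, not `fit_transform()`, at prediction time):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# --- training phase ---
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])  # toy features
myscaler = MinMaxScaler()
X_scaled = myscaler.fit_transform(X_train)   # fit ONCE, on training data

# --- prediction phase ---
song = np.array([2.5, 25.0]).reshape(1, -1)  # one sample, same feature count
scaled_song = myscaler.transform(song)       # reuse training min/max
# scaled_song now feeds model.predict(scaled_song)
```

`reshape(1, -1)` gives a single row with one column per feature, matching the training matrix, so no transposing with `reshape(-1, 1)` is needed.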