I am trying to split a dataset into training and testing sets. In the code below, df_min_max_scaled is my normalized data and df is my unnormalized data, but I am getting an error:
import numpy as np
train_ind = df.sample(frac=0.65, replace=True)
train = df_min_max_scaled[train_ind,]
test = df_min_max_scaled[-train_ind,]
train_labels = df[train_ind, 12]
test_labels = df[-train_ind, 12]
#train_labels
Error:
TypeError Traceback (most recent call last)
<ipython-input-50-a640d18b42fc> in <module>
1 import numpy as np
2 train_ind = df.sample(frac=0.65, replace=True)
----> 3 train = df_min_max_scaled[train_ind,]
4 test = df_min_max_scaled[-train_ind,]
5 train_labels = df[train_ind, 12]
It's showing the error on line 3. I am actually converting the following R code to Python using pandas:
train_ind = sample(nrow(wine), floor(0.65 * nrow(wine)))
train = wine2[train_ind,]
test = wine2[-train_ind,]
train_labels = wine[train_ind, 12]
test_labels = wine[-train_ind, 12]
train_ind in pandas is already the train dataset. Probably you need train_ind = np.random.choice(wine.shape[0], np.floor(0.65 * wine.shape[0])) - onyambu
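To make the comment's suggestion concrete, here is a minimal sketch of one way the R snippet might be translated. It rests on several assumptions: the data is a hypothetical random stand-in with 12 columns, and the names rng, n_rows, n_train and test_ind are introduced only for illustration. It also makes three adjustments: np.floor returns a float, so the sample size is cast to int; np.random.choice samples with replacement by default while R's sample does not, so replace=False is passed; and R's column 12 is 1-based, so it becomes position 11 in pandas.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 20 rows, 12 columns (not the real wine data).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((20, 12)))
df_min_max_scaled = (df - df.min()) / (df.max() - df.min())

n_rows = df.shape[0]
n_train = int(np.floor(0.65 * n_rows))  # size must be an int, not a float

# Sample row positions without replacement, like R's sample(nrow(wine), n).
train_ind = rng.choice(n_rows, size=n_train, replace=False)

# R's negative index (-train_ind) means "every row except these".
test_ind = np.setdiff1d(np.arange(n_rows), train_ind)

# .iloc selects rows by integer position, as R's wine2[train_ind, ] does.
train = df_min_max_scaled.iloc[train_ind]
test = df_min_max_scaled.iloc[test_ind]

# R's 1-based column 12 is position 11 in pandas.
train_labels = df.iloc[train_ind, 11]
test_labels = df.iloc[test_ind, 11]
```

The original line 3 fails because df.sample returns the sampled rows themselves (a DataFrame), not row positions, so it cannot be used as an index into another DataFrame.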