I'm using `DecisionTreeClassifier` from sklearn, but I'm getting a 100% score and I don't know what is wrong. I have tested SVM and KNN, and both give 60% to 80% accuracy, which seems OK. Here is my code:

    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    # X_train, Y_train, X_valid, Y_valid are assumed to be defined earlier (not shown)
    maxScore = 0
    index = 0
    Depths = [1, 5, 10, 20, 40]
    for i, d in enumerate(Depths):
        clf1 = DecisionTreeClassifier(max_depth=d)
        # Mean accuracy over 10-fold cross-validation
        score = cross_val_score(clf1, X_train, Y_train, cv=10).mean()
        # Keep the first depth that achieves the highest score
        index = i if (score > maxScore) else index
        maxScore = max(score, maxScore)
        print('The cross val score for Decision Tree classifier (max_depth=' + str(d) + ') is ' + str(score))

    d = Depths[index]
    print()
    print("So the best value for max_depth parameter is " + str(d))
    print()

    # Classifying
    clf1 = DecisionTreeClassifier(max_depth=d)
    clf1.fit(X_train, Y_train)
    preds = clf1.predict(X_valid)
    print(" The accuracy obtained using Decision tree classifier is {0:.8f}%".format(100 * clf1.score(X_valid, Y_valid)))

And here is the output:

    The cross val score for Decision Tree classifier (max_depth=1) is 1.0
    The cross val score for Decision Tree classifier (max_depth=5) is 0.9996212121212121
    The cross val score for Decision Tree classifier (max_depth=10) is 1.0
    The cross val score for Decision Tree classifier (max_depth=20) is 1.0
    The cross val score for Decision Tree classifier (max_depth=40) is 0.9996212121212121
    So the best value for max_depth parameter is 1
    The accuracy obtained using Decision tree classifier is 100.00000000%
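
A score of 1.0 at `max_depth=1` means a single threshold on a single feature separates the classes, so one check that might help is to look at which feature the root node splits on. Below is a minimal sketch on synthetic data, not my real dataset: the arrays `X` and `y`, the noise columns, and the deliberately leaked last column are all assumptions made up for illustration. It only shows how a feature that duplicates the label would produce this kind of result and how it could be spotted.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
n_samples = 200
y = rng.integers(0, 2, size=n_samples)

# Three uninformative noise columns plus one column that copies the label
noise = rng.normal(size=(n_samples, 3))
X = np.column_stack([noise, y.astype(float)])

# A depth-1 tree (a "stump") can only use one feature and one threshold
stump = DecisionTreeClassifier(max_depth=1, random_state=0)
cv_score = cross_val_score(stump, X, y, cv=10).mean()

# Fit once on all data to inspect which feature the root node uses
stump.fit(X, y)
root_feature = int(stump.tree_.feature[0])
importances = stump.feature_importances_

print("Mean CV score:", cv_score)
print("Feature index used at the root:", root_feature)
print("Feature importances:", importances)
```

If the same inspection on the real `X_train` points at one column, it may be worth checking whether that column is derived from, or identical to, `Y_train`.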