
Build a decision tree regressor model from the X_train set and Y_train labels, with default parameters. Name the model dt_reg.

Evaluate the model's accuracy on the training data set and print its score.

Evaluate the model's accuracy on the testing data set and print its score.

Predict the housing price for the first two samples of the X_test set and print them. (Hint: use the predict() function.)

Fit multiple decision tree regressors on the X_train data and Y_train labels, with the max_depth parameter value ranging from 2 to 5.

Evaluate each model's accuracy on the testing data set.

Hint: use a for loop.

Print the max_depth value of the model with the highest accuracy.

import sklearn.datasets as datasets
from sklearn.model_selection import train_test_split 
from sklearn.tree import DecisionTreeRegressor
import numpy as np
np.random.seed(100) 
boston = datasets.load_boston()
X_train, X_test, Y_train, Y_test = train_test_split(boston.data, boston.target, random_state=30)
print(X_train.shape)
print(X_test.shape)

dt_reg = DecisionTreeRegressor()   
dt_reg = dt_reg.fit(X_train, Y_train) 
print(dt_reg.score(X_train,Y_train))
print(dt_reg.score(X_test,Y_test))
y_pred=dt_reg.predict(X_test[:2])
print(y_pred)

I want to print the max_depth value of the model with the highest accuracy, but Fresco Play does not accept my submission. What is the error in the code below?

max_reg = None
max_score = 0  
t=()
for m in range(2, 6) :
    rf_reg = DecisionTreeRegressor(max_depth=m)
    rf_reg = rf_reg.fit(X_train, Y_train) 
    rf_reg_score = rf_reg.score(X_test,Y_test)
    print (m, rf_reg_score ,max_score) 
    if rf_reg_score > max_score :
        max_score = rf_reg_score
        max_reg = rf_reg
        t = (m,max_score) 
print (t)

2 Answers

1
votes

If you want to keep the loop as you've written it, add another variable called best_max_depth and set it to rf_reg.max_depth whenever your if condition is met (that is, whenever the current model is the best so far).

I suggest, however, that you look into GridSearchCV, which loops over the parameter values for you and exposes the parameters of the best model.

max_reg = None
max_score = float("-inf")  # R^2 can be negative, so do not start from 0
best_max_depth = None
for m in range(2, 6):
    rf_reg = DecisionTreeRegressor(max_depth=m)
    rf_reg = rf_reg.fit(X_train, Y_train)
    rf_reg_score = rf_reg.score(X_test, Y_test)
    print(m, rf_reg_score, max_score)
    if rf_reg_score > max_score:
        # Best model so far: remember its score, the model, and its depth
        max_score = rf_reg_score
        max_reg = rf_reg
        best_max_depth = rf_reg.max_depth
# Print only the depth of the best model, as the exercise asks
print(best_max_depth)
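
The GridSearchCV suggestion above could look roughly like the sketch below. It is only an illustration: the data comes from make_regression as a stand-in (an assumption, not the housing data from the question), and the names param_grid, search and best_depth are made up for this example.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption): the question uses the Boston housing set.
X, y = make_regression(n_samples=300, n_features=10, noise=10.0, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(X, y, random_state=30)

# Try max_depth = 2, 3, 4, 5 with 5-fold cross-validation on the training set.
param_grid = {"max_depth": list(range(2, 6))}
search = GridSearchCV(DecisionTreeRegressor(random_state=0), param_grid, cv=5)
search.fit(X_train, Y_train)

# Depth of the model with the best mean cross-validated R^2 score.
best_depth = search.best_params_["max_depth"]
print(best_depth)

# R^2 of the refitted best model on the held-out test set.
print(search.score(X_test, Y_test))
```

Note that GridSearchCV picks the depth by cross-validated score on the training data, not by the score on X_test, so it may not select the same depth as the loop above, which is what the exercise asks for.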
0
votes

Try this code:

myList = list(range(2, 6))
scores = []
for i in myList:
    # Fit one tree per depth and record its test-set score
    dt_reg = DecisionTreeRegressor(max_depth=i)
    dt_reg.fit(X_train, Y_train)
    scores.append(dt_reg.score(X_test, Y_test))
# Print the depth whose model scored highest
print(myList[scores.index(max(scores))])