1
votes

I am learning the Random Forest Regression model. I know that it builds many trees (models) and that we predict the target variable by averaging the results of all the trees. I also have a decent understanding of the Decision Tree Regression algorithm. How can we choose the best number of trees?

For example, I have a dataset where I am predicting a person's salary, and I have only two input variables, 'Years of Experience' and 'Performance Score'. How many random trees can I form using such a dataset? Does the number of trees in a Random Forest depend on the number of input variables? Any good example would be highly appreciated.

Thanks in Advance

1
Why did you tag it "deep learning"? - David Dale
The question has nothing to do with deep-learning; kindly do not spam irrelevant tags (removed and replaced with random-forest). - desertnaut

1 Answer

0
votes

A single decision tree is trained once on the entire dataset, so only one model is created. In a random forest, many decision trees are created, and each tree is trained on a random sample of the rows (typically a bootstrap sample drawn with replacement) while considering only a random subset of the features when choosing splits. In your case you have only two features, so there is little room for feature subsetting, and most of the variation between trees will come from the row sampling.
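To make the averaging concrete, here is a minimal sketch, assuming scikit-learn and NumPy are available (the question does not say which library is in use). The salary data is synthetic and made up purely for illustration, and the variable names are hypothetical. It fits a small forest and checks that the forest's prediction is the mean of the individual trees' predictions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the salary data (made-up numbers, not real data).
rng = np.random.default_rng(0)
n_rows = 200
years_experience = rng.uniform(0, 20, n_rows)
performance_score = rng.uniform(1, 5, n_rows)
X = np.column_stack([years_experience, performance_score])
salary = (30_000 + 2_500 * years_experience + 4_000 * performance_score
          + rng.normal(0, 3_000, n_rows))

# n_estimators is the number of trees; it is chosen freely, independent of
# the number of input columns (two here).
forest = RandomForestRegressor(n_estimators=25, bootstrap=True, random_state=0)
forest.fit(X, salary)

# Predict the first five rows with every tree separately, then average.
per_tree = np.array([tree.predict(X[:5]) for tree in forest.estimators_])
manual_average = per_tree.mean(axis=0)
forest_prediction = forest.predict(X[:5])
n_trees = len(forest.estimators_)

print("trees in forest:", n_trees)
print("matches forest.predict:", np.allclose(manual_average, forest_prediction))
```

Each tree sees a different bootstrap sample of the rows, so the per-tree predictions generally differ even with only two features; the forest simply averages them.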

You can create any number of trees for your data; the number of trees is a setting you choose and does not depend on how many input variables you have. In a random forest, more trees usually give better or more stable performance, with diminishing returns, at the cost of more computation time. Experiment with your data and compare the performance for different numbers of trees. If the performance stays the same, use fewer trees for faster computation. You can use grid search for this.
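The grid search suggested above could look roughly like this. It is a sketch under assumptions: it uses scikit-learn's `GridSearchCV`, synthetic made-up data in place of the real salary dataset, and an arbitrary candidate list of 10, 50, and 100 trees with mean absolute error as the metric. Adjust all of these to your own data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the salary data (made-up numbers, not real data).
rng = np.random.default_rng(0)
n_rows = 200
years_experience = rng.uniform(0, 20, n_rows)
performance_score = rng.uniform(1, 5, n_rows)
X = np.column_stack([years_experience, performance_score])
salary = (30_000 + 2_500 * years_experience + 4_000 * performance_score
          + rng.normal(0, 3_000, n_rows))

# Candidate numbers of trees to compare (an arbitrary choice for this sketch).
param_grid = {"n_estimators": [10, 50, 100]}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=5,                                # 5-fold cross-validation
    scoring="neg_mean_absolute_error",   # scikit-learn maximizes, hence "neg"
)
search.fit(X, salary)

# Report the cross-validated error for each candidate.
for params, score in zip(search.cv_results_["params"],
                         search.cv_results_["mean_test_score"]):
    print(f"{params['n_estimators']:>4} trees: mean CV MAE = {-score:,.0f}")

best_n_trees = search.best_params_["n_estimators"]
print("best n_estimators:", best_n_trees)
```

If the errors for the larger settings are nearly identical, picking the smaller number of trees gives the same quality with less computation.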

You can also experiment with other ML models, such as linear regression, which might perform well in your case.