Why does my decision tree have no nodes? (C5.0)

Question

I have 204 data points with 6 attributes.

When I build the model on all the data with this script, model = C5.0(dataset1[,-7], dataset1[,7]), the resulting tree has no nodes, as in the picture below.

But if I use only the first 100 rows with this script, model = C5.0(dataset1[1:100,-7], dataset1[1:100,7]), the result is a good decision tree, as in the picture below.

What is the problem? Is the problem in the data? Thank you.

G5W · Accepted Answer · 2020-02-19T13:53:51

Examining the display of your trees, it is easy to see what happened. The second model, using only 100 points, is NOT a better model than the first. When you gave C5.0 more data, it correctly determined that a simpler model was superior. Look at the results.

The first tree (with all 204 points) predicts everything is Lancar giving an error rate of 27% (55 errors out of 204).
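A tree with no splits is just the majority-class baseline, so its error rate can be checked without fitting anything. The sketch below rebuilds a label vector from the counts quoted above; it assumes there are only two classes (Lancar and Macet), so that the 55 errors are all Macet cases. The vector y is a stand-in for dataset1[,7], not the real data.

```r
# Stand-in labels rebuilt from the counts in the answer:
# 204 points, 55 of which a "predict Lancar" rule gets wrong.
# Assumption: only two classes exist, so those 55 are all Macet.
y <- factor(rep(c("Lancar", "Macet"), times = c(204 - 55, 55)))

# Class counts and the majority class
class_counts <- table(y)
majority <- names(which.max(class_counts))

# Error rate of always predicting the majority class
baseline_error <- mean(y != majority)

print(class_counts)
cat(sprintf("Majority class: %s, error rate: %.1f%%\n",
            majority, 100 * baseline_error))
```

On the real data, replacing y with dataset1[,7] gives the same check, and summary(model) from the C50 package reports the training errors of the fitted tree for comparison.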

What is the error rate for the second tree?

Node 2 predicts Lancar for 55 points with 25.5% errors (14 errors).
Node 4 predicts Lancar for 25 points with 28.0% errors (7 errors).
Node 6 predicts Macet for 8 points with 50.0% errors (4 errors).
Node 7 predicts Macet for 12 points with 41.7% errors (5 errors).

Total: 30 errors out of 100, or 30.0%, which is worse than the 27% error rate for the simpler model. C5.0 simply determined that the best model available was to predict that all points are in the majority class (Lancar).
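The arithmetic above can be reproduced in a few lines of base R. This is only a sketch: the leaves data frame is typed in by hand from the node summaries in this answer (its name and columns are made up for illustration), not extracted from a fitted C5.0 object.

```r
# Per-leaf counts copied from the 100-point tree described above
leaves <- data.frame(
  node      = c(2, 4, 6, 7),
  predicted = c("Lancar", "Lancar", "Macet", "Macet"),
  n         = c(55, 25, 8, 12),
  errors    = c(14, 7, 4, 5)
)

# Error percentage within each leaf
leaves$error_pct <- round(100 * leaves$errors / leaves$n, 1)

# Overall error of the 4-leaf tree versus the single-leaf model
tree_error     <- sum(leaves$errors) / sum(leaves$n)  # 30 / 100
majority_error <- 55 / 204                            # all 204 points predicted Lancar

print(leaves)
cat(sprintf("4-leaf tree on 100 points:   %.1f%% errors\n", 100 * tree_error))
cat(sprintf("Single leaf on 204 points:   %.1f%% errors\n", 100 * majority_error))
```

Keep in mind that both figures are training errors, and they are measured on different subsets of the data.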
