
I'm training a model with method = "ctree" in caret and trying to plot the resulting decision tree. This is the main portion of my code:

fitControl <- trainControl(method = "cv", number = 10)
dtree <- train(
  Outcome ~ ., data = training_set, 
  method = "ctree", trControl = fitControl
)

I'm trying to plot the decision tree and I use

plot(dtree$finalModel)

which gives me this:

[decision tree plot with bar graphs under the terminal nodes]

The picture isn't great here, but I get an image similar to the first plot in the answer to this question: Plot ctree using rpart.plot functionality.

The as.simpleparty function doesn't work on it either, since dtree$finalModel is not a partykit party object.

I want to remove the bar graphs underneath and simply get a 1 or a 0 on those nodes telling me how each node is classified. Since dtree$finalModel is a BinaryTree object,

prp(dtree$finalModel)

doesn't work.

Without being familiar with ctree in particular, the fallback is always to capture the output plot object and manually manipulate it to remove the unwanted bits. If you find an elegant way to do that, please post it here and submit it to the package maintainers. – smci

1 Answer


It's possible to get a ctree plot without the graphs at the bottom, but with the outcome labels, without using caret. I've included the caret code below for completeness, though.

First, set up some data for a reproducible example:

library(caret)
library(partykit)
data("PimaIndiansDiabetes", package = "mlbench")
head(PimaIndiansDiabetes)
  pregnant glucose pressure triceps insulin mass pedigree age diabetes
1        6     148       72      35       0 33.6    0.627  50      pos
2        1      85       66      29       0 26.6    0.351  31      neg
3        8     183       64       0       0 23.3    0.672  32      pos
4        1      89       66      23      94 28.1    0.167  21      neg
5        0     137       40      35     168 43.1    2.288  33      pos
6        5     116       74       0       0 25.6    0.201  30      neg

Now find optimal ctree parameters using caret:

fitControl <- trainControl(method = "cv", number = 10)
dtree <- train(
  diabetes ~ ., data = PimaIndiansDiabetes, 
  method = "ctree", trControl = fitControl
)

dtree
Conditional Inference Tree

768 samples
  8 predictor
  2 classes: 'neg', 'pos'

No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 691, 691, 691, 692, 691, 691, ...
Resampling results across tuning parameters:

  mincriterion  Accuracy   Kappa
  0.01          0.7239747  0.3783882
  0.50          0.7447027  0.4230003
  0.99          0.7525632  0.4198104

Accuracy was used to select the optimal model using the largest value.
The final value used for the model was mincriterion = 0.99.

This is not an ideal model, but hey ho and on we go.
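As an aside, dtree$finalModel here is party's BinaryTree S4 object (which is why rpart-oriented tools such as prp fail on it), and the selected mincriterion can be pulled from the train object rather than hardcoded. A small sketch:

class(dtree$finalModel)   # "BinaryTree", the S4 class from the party package
dtree$bestTune            # one-row data frame holding the winning mincriterion
best_mc <- dtree$bestTune$mincriterion

best_mc could then be passed to ctree() below in place of the literal 0.99.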

Now build and plot a ctree model using the partykit package with the optimal mincriterion from caret:

ct <- ctree(diabetes ~ ., data = PimaIndiansDiabetes, mincriterion = 0.99)

png("diabetes.ctree.01.png", res=300, height=8, width=14, units="in")
plot(as.simpleparty(ct))
dev.off()

This gives the following figure, without the graphs at the bottom but with the outcome labels ("pos" and "neg") on the terminal nodes. It's necessary to use non-default height and width values in the png() call to avoid overlapping terminal nodes.

[Diabetes ctree diagram: tree with "pos"/"neg" labels on the terminal nodes]
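As an alternative to converting with as.simpleparty, partykit's plot method for constparty objects accepts a type argument; a minimal sketch, if you'd rather keep the full ct object and just swap the bar plots for compact text nodes:

plot(ct, type = "simple")   # terminal nodes show predicted class, n and error as text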

Note: care should be taken with 0/1 outcome variables when using ctree with caret. Given integer or numeric 0/1 data, caret's ctree method defaults to building a regression model. Convert the outcome variable to a factor if classification is required.
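A minimal sketch of that conversion, using the training_set and Outcome names from the question:

training_set$Outcome <- factor(training_set$Outcome)   # "0"/"1" become class labels

With a factor outcome, train(Outcome ~ ., ...) will fit a classification tree instead.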