20
votes

(I'm still learning how to handle images in R; this is sort of a continuation of "rpart package: Save Decision Tree to PNG".)

I'm trying to save a decision tree plot from rpart in PNG form, instead of the provided postscript. My code looks like this:

png("tree.png", width=1000, height=800, antialias="cleartype")
plot(fit, uniform=TRUE, 
   main="Classification Tree")
text(fit, use.n=TRUE, all=TRUE, cex=.8)
dev.off()

but it cuts off part of the labels for the edge nodes on both sides. This isn't a problem in the original postscript image, which I've converted to PNG just to check. I've tried both the oma and mar settings in par, which were recommended as solutions for label/text problems; both added white space around the image but didn't show any more of the labels. Is there any way to get the text to fit?

4
Try reading the documentation contained at ?plot.rpart and pay particular attention to the margin argument. – joran
Ah, I didn't know there were quite so many ways to set margins. Thanks! – rhae66

4 Answers

18
votes

The rpart.plot package plots rpart trees and automatically takes care of the margin and related issues. Use rpart.plot (instead of plot and text in the rpart package). For example:

library(rpart.plot)
data(ptitanic)
fit <- rpart(survived~., data=ptitanic)
png("tree.png", width=1000, height=800, antialias="cleartype")
rpart.plot(fit, main="Classification Tree")
dev.off()

14
votes

The default margin is 0. So if your labels are long (several words, or just one long word), try adding more margin in the plot call. For example,

plot(fit, uniform=TRUE,margin=0.2)
text(fit, use.n=TRUE, all=TRUE, cex=.8)

Alternatively, you can shrink the text by lowering cex in the text call. For example,

plot(fit, uniform=TRUE)
text(fit,use.n=TRUE, all=TRUE, cex=.7)

Of course, you can adjust both margin in the plot call and cex in the text call to get what you want.
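
As a rough sketch of combining the two, here is a complete PNG example. It assumes the kyphosis data that ships with rpart in place of the asker's own fit, and the values margin = 0.1 and cex = 0.7 are just starting points to tune, not recommended settings:

```r
library(rpart)

# Stand-in model (assumption): the kyphosis data bundled with rpart
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)

# Write to a temporary file so the example does not clutter the working directory
out_margin <- file.path(tempdir(), "tree_margin.png")

png(out_margin, width = 1000, height = 800)
# margin adds white space around the tree, as a fraction of the plot region
plot(fit, uniform = TRUE, margin = 0.1, main = "Classification Tree")
# a smaller cex shrinks the node labels so they need less room
text(fit, use.n = TRUE, all = TRUE, cex = 0.7)
dev.off()
```

If the edge labels are still cut off, raise margin a little or lower cex further and re-run.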

1
votes

The rpart manual gives a solution in the examples for rpart(): set xpd = NA with par:

par(mfrow = c(1,2), xpd = NA)

Otherwise, on some devices the text is clipped.
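
For the PNG case in the question, a minimal sketch might look like the following. It again assumes the kyphosis data from rpart as a stand-in model. Note that par() acts on the current device, so it has to be called after png() opens the file, and text that would fall outside the image itself can still be cut off:

```r
library(rpart)

# Stand-in model (assumption): the kyphosis data bundled with rpart
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)

out_xpd <- file.path(tempdir(), "tree_xpd.png")

png(out_xpd, width = 1000, height = 800)
# Set xpd only after the device is open; NA lets drawing extend
# past the plot region into the rest of the device
par(xpd = NA)
plot(fit, uniform = TRUE, main = "Classification Tree")
text(fit, use.n = TRUE, all = TRUE, cex = 0.8)
dev.off()
```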

0
votes

The problem with the titanic dataset is that, when age and fare are not stored as numeric columns, the plot will not collapse them into a nice "age > 10" label. Instead it lists every value, like:

age = 11,18,19,22,24,28,29,30,32,33,37,39,40,42,45.5,5,56,58,60...

That leaves no room for the labels (see the picture).

[Image: bad labels]

The solution is here: https://community.rstudio.com/t/rpart-result-is-too-small-to-see/60702/4

Basically, you have to convert the age and fare columns to numeric variables with mutate, like:

library(dplyr)

clean_titanic <- titanic %>% 
  select(-c(home.dest, cabin, name, x, ticket)) %>%
  mutate(
    pclass = factor(pclass, levels = c(1, 2, 3), labels = c('Upper', 'Middle', 'Lower')),
    survived = factor(survived, levels = c(0, 1), labels = c('No', 'Yes')),
    # HERE: convert age and fare to numeric so splits read like "age >= 10"
    age = as.numeric(age),
    fare = as.numeric(fare)
  )

That will give you better labels, and room for them in the plot.

One more thing: as.numeric may give a warning when it is forced onto non-numeric values (they become NA). There are a couple of ways to deal with that, like replacing the offending characters first or ignoring the warning. To ignore it:

suppressWarnings(as.numeric(age))
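
A small sketch of what that coercion does, using made-up values (the "?" entry is a hypothetical placeholder for a missing age, not taken from the real data). It also shows a related gotcha: if the column is a factor rather than character, as.numeric returns the internal level codes, so going through as.character first is the safer route:

```r
# Made-up character ages; "?" stands in for a non-numeric entry (assumption)
age_raw <- c("29", "0.9167", "?", "45.5")

# Non-numeric entries become NA; suppressWarnings hides the coercion warning
age_num <- suppressWarnings(as.numeric(age_raw))
n_missing <- sum(is.na(age_num))

# Factor gotcha: as.numeric() on a factor gives level codes, not the values
age_fct <- factor(c("29", "0.9167", "45.5"))
age_codes <- as.numeric(age_fct)
age_from_fct <- as.numeric(as.character(age_fct))
```

Checking n_missing afterwards is a cheap way to see how many values were silently turned into NA before fitting the tree.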

[Image: good plot]