I am using partykit::ctree to explore my dataset, which is a set of about 15,000 beach surveys recording the number of pieces of debris found in each of 50 different categories. There are lots of zeros in the data, and a large spread in total debris amounts. I also have a series of independent variables, including some factors, some count data, and some continuous data.

Here is a very small sample dataset:

set.seed(1)  # make the random sample reproducible
# 20 surveys (rows) x 5 debris categories (columns)
Counts <- as.data.frame(matrix(rpois(100, 1), ncol = 5))
colnames(Counts) <- c("Glass", "HardPlastic", "SoftPlastic", "PlasticBag", "Fragments")
State <- rep(c("CA", "OR", "WA"), each = 6)
Counts$State <- c(State, "CA", "OR")
County <- rep(1:9, each = 2)
Counts$County <- c(County, 1, 4)
Counts$Distance <- c(10, 15, 13, 19, 18, 23, 38, 40, 49, 44, 47, 45, 52, 53, 55, 59, 51, 53, 14, 33)
Year <- rep(c("2010", "2011", "2012"), times = 7)
Counts$Year <- Year[1:20]

I have used the following code to partition my data:

library(partykit)  # provides ctree() and node_barplot
M.2 <- ctree(Glass + HardPlastic + SoftPlastic + PlasticBag + Fragments ~
               as.factor(State) + as.factor(County) + Distance + as.factor(Year),
             data = Counts)
plot(M.2, terminal_panel = node_barplot, cex = 0.5)

This comes up with a lovely graph, but how do I extract the membership of each of the terminal nodes? I can see it in the graph if there are only a few items, but once the number of possible categories increases to 50, it becomes much harder to look at it graphically. I would like to see the information contained within the nodes; particularly the relative probabilities of each individual category being contained in each terminal node.

I know that if this were a BinaryTree object, I could use the nodes argument, but class(M.2) tells me it is of class constparty, and I haven't been able to find out how to get node information from this class.

I have also run into a secondary problem, which is that when I run the ctree on my sample data set, it crashes R every time! It works fine with my actual data set, but I can't figure out what is wrong with the sample set.

EDIT: The desired output would be something along the lines of:

Node15:
Hard Plastic 30
Glass 5
Soft Plastic 23
Plastic Bag 6
Fragments 12

Until @Achim sorts this out, could you maybe show your desired output? I've written many similar functions over the years to complement the stuff I'm missing from the party package. - David Arenburg
Hi @David, I've added a basic possible output... not formatted very prettily, but I guess it gets the point across? - Alexandra

1 Answer

I just e-mailed Torsten Hothorn, the package maintainer and principal author of ctree(), to whom such requests would really best be directed. (He does not currently participate on SO.) Apparently this is a bug in the partykit version of ctree(), and he is working on resolving it. For the time being it is best to use the old party version for this, and hopefully a fixed partykit version will become available soon.
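
In the meantime, once you have a vector of terminal node IDs for your observations, the per-node totals and relative shares asked for in the question can be computed with base R alone. The following is only a rough sketch under assumptions: node_id here is a made-up assignment for illustration, not the output of a real tree. With a tree fitted by the old party package I would expect where(fit) to supply it, and with a working partykit fit, predict(fit, type = "node"); please check both against the respective documentation.

```r
# Sketch: summarise debris counts per terminal node using base R only.
set.seed(1)
debris <- c("Glass", "HardPlastic", "SoftPlastic", "PlasticBag", "Fragments")
Counts <- as.data.frame(matrix(rpois(100, 1), ncol = 5))
colnames(Counts) <- debris

# ASSUMPTION: hypothetical node assignment, for illustration only.
# In practice this would come from the fitted tree, e.g. (unverified here)
#   node_id <- where(fit)                   # party
#   node_id <- predict(fit, type = "node")  # partykit
node_id <- rep(c(2, 4, 5), length.out = nrow(Counts))

# Total pieces of each debris category within each terminal node
node_totals <- rowsum(Counts[, debris], group = node_id)

# Relative share of each category within each node (each row sums to 1)
node_props <- prop.table(as.matrix(node_totals), margin = 1)

print(node_totals)
print(round(node_props, 3))
```

Each row of node_totals corresponds to one terminal node, so node_totals["5", ] gives a listing like the "Node15" example in the question, and the same row of node_props gives the relative proportions. With 50 categories this stays readable as a table where the bar plots do not.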