Thanks for any help - I'm building a decision tree in R, and the classic example is

iris_ctree <- ctree(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, data=iris)

My question is: what if I wanted to supply a variable number of predictors? Say that instead of the fixed Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, the columns were

Flowervar1, Flowervar2, Flowervar3, etc. If I don't know the number of independent variables until the program is run, how do I pass them into the formula?

How would you "know" which variables you wanted to include? It would help if you had a more concrete reproducible example. - MrFlick
So in this situation, I've got hundreds of possible independent variables, and I want to offer the user the chance to pick how many of them to include - so what if, for the flower example, there were 40 possible measurements, and you could choose at runtime how many would be used in the making of the tree. Does that help? - Bacter
How are you going to interact with a user at run time with R? What's the input you plan to pass to the modeling function? - MrFlick
The particulars of the situation are: I'm trying to classify text files into one of two categories. I've got several metrics for picking out the "best" words to classify the documents, and one of the specifications is that the user should be able to specify how many of the top words are used to classify. Therefore, I can't really hard-code a number of independent variables into the decision tree, I've got to go with the number specified. - Bacter
I don't think the formula is ACTUALLY calling on me to sum the columns with "independent variable 1 + independent variable 2 + ...", right? - Bacter

1 Answer

Based on the excellent recommendations of MrFlick, I found it!

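# 1:iterations-1 parses as (1:iterations) - 1, i.e. 0:(iterations - 1), giving "df0", "df1", ...
listofintfactors <- paste0("df", 0:(iterations - 1))

# Drop "df0" and join the remaining names with "+" to build the right-hand side.
form <- as.formula(paste("df[,ncol(df)] ~", paste(listofintfactors[-1], collapse = " + ")))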

got me the exact formula I needed, and I was then able to plug that into the decision tree and, hey presto, away it went. "iterations" is the number of variables I'm testing, so this now works for any user-input number.
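For anyone landing here later, the same idea can probably be written without pasting strings by hand. The sketch below uses base R's reformulate() on the iris example from the question; n_vars and candidates are made-up names standing in for the user's input and the ranked list of predictors, not anything from my real code. It also shows that the "+" in a formula just means "include this term", so nothing is being summed.

```r
# Hypothetical inputs: a user-supplied count and a ranked vector of
# candidate predictor names (both are assumptions for illustration).
n_vars <- 3
candidates <- c("Petal.Length", "Petal.Width", "Sepal.Length", "Sepal.Width")

# Keep the first n_vars names; head() simply returns the whole vector
# if n_vars is larger than the number of candidates.
chosen <- head(candidates, n_vars)

# reformulate() builds `response ~ term1 + term2 + ...` from character vectors.
form <- reformulate(chosen, response = "Species")
print(form)

# The formula can then be handed to a modelling function that accepts one,
# for example (assuming the party package is installed):
# iris_ctree <- party::ctree(form, data = iris)
```

Changing n_vars changes how many terms end up on the right-hand side, which is the behaviour I was after with the paste() version.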

So: Thanks, MrFlick!