2
votes

How can I access a column by using a variable that contains the name of the column?

Assume we have a data frame DF with three columns, Var1, Var2 and Var3, where Var3 contains numeric data and Var1 and Var2 are factors with a few levels each.

We would like to produce 2 boxplots using a temporary variable that contains the name of the column:

temp <- "Var3"
boxplot(DF[temp])  # works

If I use the same approach to get one boxplot per level of Var2, it fails:

boxplot(DF[temp] ~ DF$Var2)  # does not work

How can I get this working?

Note: if I use the name Var3 directly, it does work and draws one boxplot per level:

boxplot(DF$Var3 ~ DF$Var2)

3 Answers

6
votes

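Try using double brackets instead of single brackets. Single brackets return a one-column data frame, while double brackets extract the column itself as a vector, which is what the formula needs: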

tmp1 <- 'Sepal.Width'
tmp2 <- 'Species'
boxplot( iris[[tmp1]] ~ iris[[tmp2]] )
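
To see the difference for yourself, here is a minimal sketch on the built-in iris data. The variable name col is only for illustration and does not come from the question.

```r
col <- "Sepal.Width"

# Single brackets keep the data frame wrapper (one column)
cls_single <- class(iris[col])

# Double brackets return the column itself, a plain numeric vector
cls_double <- class(iris[[col]])

# The formula interface wants vectors on both sides, so this draws
# one box per level of Species
res <- boxplot(iris[[col]] ~ iris[["Species"]], xlab = "Species", ylab = col)
```

boxplot returns its summary statistics invisibly, so res$names can be inspected to confirm that the grouping used the levels of Species.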

1
votes

You could simply do this. with() evaluates the boxplot call with the columns of DF visible as variables, and get(tmp) fetches the object whose name is stored in tmp.

with(DF, boxplot(get(tmp) ~ Var2))

Here is an illustrative example:

tmp <- 'wt'
with(mtcars, boxplot(get(tmp) ~ cyl))
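
One caveat worth sketching: when the name is not a column of the data, get() goes on to search the enclosing environments, so it could silently pick up an unrelated object. The guard below is my own addition, not part of the approach above, and plot = FALSE is used only to return the statistics without drawing.

```r
tmp <- "wt"

# Guard (an added safety check): stop early if tmp is not a column of mtcars
stopifnot(tmp %in% names(mtcars))

# plot = FALSE returns the boxplot statistics without drawing anything
res <- with(mtcars, boxplot(get(tmp) ~ cyl, plot = FALSE))

res$names  # group labels, one per distinct value of cyl
res$n      # number of observations in each group
```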

1
votes

You can use paste to build the formula as a string, then convert it with as.formula before passing it to boxplot:

boxplot(as.formula(paste(temp, "Var2", sep = "~")), data = DF)
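
A possible alternative, if you would rather not paste strings together, is reformulate() from the stats package, which builds the formula from character vectors. This is a sketch on the built-in mtcars data rather than DF (which is not available here), with wt and cyl standing in for Var3 and Var2.

```r
temp <- "wt"

# reformulate(termlabels, response) builds the formula wt ~ cyl
f <- reformulate("cyl", response = temp)

# plot = FALSE returns the statistics only; drop it to draw the plot
res <- boxplot(f, data = mtcars, plot = FALSE)
```

Because the formula holds bare column names, the data argument tells boxplot where to look them up, just as in the paste version.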