4
votes

As a big fan of dplyr and its tidy data concept, I would like to mutate a specific variable whenever it exists in a dataframe. This is the idea:

# Load libraries
library(dplyr)

# Create data frames
df1 <- data.frame(year = 2000:2010, foo = 0:10)
df2 <- data.frame(year = 2000:2010)

# Create function
cnd_mtt <- function(df){
  df %>%
    mutate_if(colname == "foo", as.factor) # <---- this is the tricky part
}

Expected result: the function should work for both data frames and without error

Ideas?

3
what's colname? - MKR
How about an if statement: if("foo" in names(df)) ... - Gregor Thomas
@Gregor it must be %in%; see my answer. - 3pitt

3 Answers

11
votes

You can use mutate_at with one_of, which raises a warning (rather than an error) if the column doesn't exist:

cnd_mtt <- function(df){
    df %>%
        mutate_at(vars(one_of('foo')), as.factor)
}

cnd_mtt(df2)
#   year
#1  2000
#2  2001
#3  2002
#4  2003
#5  2004
#6  2005
#7  2006
#8  2007
#9  2008
#10 2009
#11 2010
Warning message:
Unknown variables: `foo`

Just to clarify, the warning is raised by one_of when it fails to find the column name among the names passed in its vars argument:

one_of('foo', vars = names(df1))
# [1] 2
one_of('foo', vars = names(df2))
# integer(0)
Warning message:
Unknown variables: `foo`

If you also want to get rid of the warning, you can follow @Gregor's comment and use mutate_at with if/else, returning integer(0) when foo is not among the columns:

df2 %>% 
    mutate_at(if('foo' %in% names(.)) 'foo' else integer(0), as.factor)

#   year
#1  2000
#2  2001
#3  2002
#4  2003
#5  2004
#6  2005
#7  2006
#8  2007
#9  2008
#10 2009
#11 2010
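
As a complement to the two variants above, here is a minimal sketch using across() with any_of(). This is a swapped-in technique, not part of the original answer, and it assumes dplyr 1.0.0 or later, where both helpers are available. any_of() silently ignores names that are not present, so no warning is expected when foo is missing.

```r
library(dplyr)

# Same example data as in the question
df1 <- data.frame(year = 2000:2010, foo = 0:10)
df2 <- data.frame(year = 2000:2010)

# Assumption: dplyr >= 1.0.0, which provides across() and any_of()
cnd_mtt <- function(data) {
  data %>%
    # any_of() selects "foo" only if it exists; otherwise it selects nothing
    mutate(across(any_of("foo"), as.factor))
}

str(cnd_mtt(df1))  # foo is converted to a factor
str(cnd_mtt(df2))  # no foo column, so nothing is changed
```

On older dplyr versions without across(), the mutate_at variants shown above remain the ones to use.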
1
votes

Building on Psidom's answer, you can also use quietly (from the purrr package) to avoid the warning:

library(dplyr)
library(purrr)  # provides quietly()

df2 %>%
  mutate_at(vars(quietly(one_of)("foo", "boo", .vars = tidyselect::peek_vars())$result),
            as.factor)
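
If pulling in purrr only for this feels heavy, a blunter option is base R's suppressWarnings(). The sketch below is an illustration rather than part of the answer; note that it silences every warning raised inside the wrapped call, not only the one from one_of.

```r
library(dplyr)

df2 <- data.frame(year = 2000:2010)

# suppressWarnings() hides all warnings from the expression it wraps,
# including the "unknown column" warning that one_of() raises here
res <- suppressWarnings(
  df2 %>%
    mutate_at(vars(one_of("foo")), as.factor)
)

str(res)
```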
0
votes

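Use a basic pipe operation guarded by a plain if check; I believe the pipe itself is not tied to dplyr. Also, try not to use df as a variable name, since df() is already a function in R's stats package.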

# Load libraries
library(dplyr)

# Create data frames
df1 <- data.frame(year = 2000:2010, foo = 0:10)
df2 <- data.frame(year = 2000:2010)

# Create function
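cnd_mtt <- function(dff, colname){
    if (colname %in% names(dff)){
        # some.transformation is a placeholder for the actual transformation
        dff %>% mutate(new_col = some.transformation)
    } else {
        dff  # column absent: return the data frame unchanged
    }
}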
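
To make the idea above concrete, here is a minimal runnable sketch that fills the placeholder with as.factor, the conversion asked for in the question. It uses base R only. Overwriting the column in place (rather than creating new_col) is an assumption made for this illustration.

```r
# Same example data as in the question
df1 <- data.frame(year = 2000:2010, foo = 0:10)
df2 <- data.frame(year = 2000:2010)

cnd_mtt <- function(dff, colname) {
  # Convert the column only when it is actually present
  if (colname %in% names(dff)) {
    dff[[colname]] <- as.factor(dff[[colname]])
  }
  # Return the data frame in both cases, so the function never yields NULL
  dff
}

str(cnd_mtt(df1, "foo"))  # foo becomes a factor
str(cnd_mtt(df2, "foo"))  # returned unchanged, no warning
```

Returning dff outside the if block matters: without it, the function would return NULL whenever the column is missing.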