2
votes

I have a df with multiple columns, like in the example below. I want to replace all zeros with the number two in columns A1 to A5, but I do not want to write out all the column names in the mutate function. Does anyone know how I can create a loop that goes from A1 to A5 and replaces the zeros with two using mutate?

df = data.frame(A1 = c(0,1,1,0,0,1,1,1), B1 = c(0,1,1,0,0,0,0,0), C1 = c(1,1,1,0,0,0,0,0), A2 = c(0,1,1,0,0,0,0,0), A3 = c(1,1,1,0,1,1,1,1), A4 = c(1,1,1,0,0,1,1,1), A5 = c(0,1,1,0,0,1,1,1), C2 = c(1,1,1,0,0,1,0,0))

I tried to do that with the following loop

for (i in 1:5) {
   a = paste0('A', i)
  df = df %>% mutate(a = ifelse( a == 0, 2, 1))
}

...but the mutate function does not accept the variable.

3 Answers

5
votes

It can be done without any loop. Create a vector of the column names (or a numeric index) of the columns to be changed ('nm1'), subset the dataset with it, build a logical matrix on that subset that is TRUE wherever the value is 0, and assign 2 to those positions

nm1 <- paste0("A", 1:5)
#Or use `startsWith` to pick every column whose name starts with "A"
#nm1 <- names(df)[startsWith(names(df), "A")]
df[nm1][!df[nm1]] <- 2
df
#  A1 B1 C1 A2 A3 A4 A5 C2
#1  2  0  1  2  1  1  2  1
#2  1  1  1  1  1  1  1  1
#3  1  1  1  1  1  1  1  1
#4  2  0  0  2  2  2  2  0
#5  2  0  0  2  1  2  2  0
#6  1  0  0  2  1  1  1  1
#7  1  0  0  2  1  1  1  0
#8  1  0  0  2  1  1  1  0
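
To see why the one-liner works, it may help to break it into steps. The sketch below rebuilds the example data so it runs on its own; `is_zero` and `sub` are hypothetical names used only for illustration and are not part of the answer above.

```r
# Rebuild the example data from the question
df <- data.frame(A1 = c(0,1,1,0,0,1,1,1), B1 = c(0,1,1,0,0,0,0,0),
                 C1 = c(1,1,1,0,0,0,0,0), A2 = c(0,1,1,0,0,0,0,0),
                 A3 = c(1,1,1,0,1,1,1,1), A4 = c(1,1,1,0,0,1,1,1),
                 A5 = c(0,1,1,0,0,1,1,1), C2 = c(1,1,1,0,0,1,0,0))

nm1 <- paste0("A", 1:5)

# Logical matrix: TRUE wherever a value in the selected columns is 0
# ('is_zero' is a hypothetical name, for illustration only)
is_zero <- df[nm1] == 0

# Take the subset, overwrite the flagged cells, and write the subset back
sub <- df[nm1]
sub[is_zero] <- 2
df[nm1] <- sub
```

Writing `df[nm1] == 0` instead of `!df[nm1]` gives the same matrix for this data and states the condition explicitly.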

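Or, because these columns contain only 0 and 1, the whole subset can be recomputed arithmetically (!0 is TRUE, so TRUE + 1 gives 2; !1 is FALSE, so FALSE + 1 gives 1)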

df[nm1] <-  (!df[nm1]) + 1

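Or with replace (this returns a new data frame with the changed columns moved to the end, rather than modifying df in place)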

cbind(df[setdiff(names(df), nm1)], replace(df[nm1], !df[nm1], 2))

With dplyr, for multiple columns, we can use mutate_all (for all the columns) or mutate_at (for selected columns)

library(dplyr)
df %>%
    mutate_at(vars(nm1), ~ replace(., .== 0, 2))
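
In dplyr 1.0.0 and later, `mutate_at()` is superseded by `across()`. A minimal sketch of the same replacement in that style, assuming a recent dplyr is installed; it rebuilds the example data so it runs on its own, and `df2` is a name introduced here for the result.

```r
library(dplyr)

# Rebuild the example data from the question
df <- data.frame(A1 = c(0,1,1,0,0,1,1,1), B1 = c(0,1,1,0,0,0,0,0),
                 C1 = c(1,1,1,0,0,0,0,0), A2 = c(0,1,1,0,0,0,0,0),
                 A3 = c(1,1,1,0,1,1,1,1), A4 = c(1,1,1,0,0,1,1,1),
                 A5 = c(0,1,1,0,0,1,1,1), C2 = c(1,1,1,0,0,1,0,0))

nm1 <- paste0("A", 1:5)

# all_of() selects the columns named in an external character vector;
# the lambda replaces zeros with 2 and leaves every other value untouched
df2 <- df %>%
  mutate(across(all_of(nm1), ~ replace(.x, .x == 0, 2)))
```

The tidyselect helper `num_range("A", 1:5)` should select the same columns without building `nm1` first.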

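Or we can use a loop (as it seems the OP is interested only in loops). Here := lets the left-hand side be built from the string in 'a' by unquoting it with !!. On the right-hand side, the string is converted to a symbol with rlang::sym() and unquoted with !!, so it refers to the column itself; that column is compared with 0, returning 2 where it is 0 and 1 otherwise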

for (i in 1:5) {
  a <- paste0('A', i)
  df <- df %>%
    mutate(!!a := ifelse(!!rlang::sym(a) == 0, 2, 1))
}

NOTE: paste is vectorized, so we don't need to create 'a' inside the loop. It can be created once, outside the loop, and the loop can then iterate over the names directly

a <- paste0("A", 1:5)
for(nm in a) {
  df <- df %>%
          mutate(!! nm := ifelse(!! rlang::sym(nm) == 0, 2, 1))
 }
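
If the `!!rlang::sym()` pattern feels heavy, the `.data` pronoun is another way to refer to a column whose name is stored in a string. This is a sketch of that alternative, not the approach from the loops above. It assumes dplyr is installed, rebuilds the example data, and keeps non-zero values as they are instead of forcing them to 1.

```r
library(dplyr)

# Rebuild the example data from the question
df <- data.frame(A1 = c(0,1,1,0,0,1,1,1), B1 = c(0,1,1,0,0,0,0,0),
                 C1 = c(1,1,1,0,0,0,0,0), A2 = c(0,1,1,0,0,0,0,0),
                 A3 = c(1,1,1,0,1,1,1,1), A4 = c(1,1,1,0,0,1,1,1),
                 A5 = c(0,1,1,0,0,1,1,1), C2 = c(1,1,1,0,0,1,0,0))

for (nm in paste0("A", 1:5)) {
  # !!nm on the left of := names the output column from the string;
  # .data[[nm]] looks up the column with that name inside the data frame
  df <- df %>%
    mutate(!!nm := ifelse(.data[[nm]] == 0, 2, .data[[nm]]))
}
```

The original question failed because `mutate(a = ...)` creates a column literally called `a`, and `a == 0` compares the string stored in `a` with 0 rather than the column it names.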

Or another option is data.table

library(data.table)
setDT(df)[, (nm1) := replace(.SD, .SD == 0, 2), .SDcols = nm1]

Or with set

setDT(df)
for(j in nm1) set(df, i = which(df[[j]] == 0), j = j, value = 2)

2
votes

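Alternatively, using the apply function, you can do the following. Note that this replaces the zeros in every column and returns a matrix rather than a data frame: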

apply(df,2,function(x) {ifelse(x==0,2,x)})

     A1 B1 C1 A2 A3 A4 A5 C2
[1,]  2  2  1  2  1  1  2  1
[2,]  1  1  1  1  1  1  1  1
[3,]  1  1  1  1  1  1  1  1
[4,]  2  2  2  2  2  2  2  2
[5,]  2  2  2  2  1  2  2  2
[6,]  1  2  2  2  1  1  1  1
[7,]  1  2  2  2  1  1  1  2
[8,]  1  2  2  2  1  1  1  2

EDIT: to change only columns A1 to A5, assign the result back to that subset of columns:

df[,paste0("A",1:5)] <- apply(df[,paste0("A",1:5)],2,function(x) {ifelse(x==0,2,x)})

  A1 B1 C1 A2 A3 A4 A5 C2
1  2  0  1  2  1  1  2  1
2  1  1  1  1  1  1  1  1
3  1  1  1  1  1  1  1  1
4  2  0  0  2  2  2  2  0
5  2  0  0  2  1  2  2  0
6  1  0  0  2  1  1  1  1
7  1  0  0  2  1  1  1  0
8  1  0  0  2  1  1  1  0
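
Because `apply()` converts its input to a matrix first, a column-wise `lapply()` may be a safer choice when the data frame has mixed column types. A minimal sketch under that assumption, rebuilding the example data; `cols` is a name introduced here for illustration.

```r
# Rebuild the example data from the question
df <- data.frame(A1 = c(0,1,1,0,0,1,1,1), B1 = c(0,1,1,0,0,0,0,0),
                 C1 = c(1,1,1,0,0,0,0,0), A2 = c(0,1,1,0,0,0,0,0),
                 A3 = c(1,1,1,0,1,1,1,1), A4 = c(1,1,1,0,0,1,1,1),
                 A5 = c(0,1,1,0,0,1,1,1), C2 = c(1,1,1,0,0,1,0,0))

cols <- paste0("A", 1:5)

# lapply() returns a list with one vector per column, so each column is
# processed on its own and no matrix conversion takes place
df[cols] <- lapply(df[cols], function(x) replace(x, x == 0, 2))
```

The result is still a data frame, and the columns outside `cols` are not touched.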

0
votes

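You can try the following base R code, using grepl() and & to build a logical mask with the same shape as df that is TRUE only where the value is 0 and the column name contains "A" (use "^A" as the pattern if other column names may contain an "A" elsewhere)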

df[df==0 & t(replicate(nrow(df),grepl("A",names(df))))]<- 2

or

df[df==0 & !!outer(rep(1,nrow(df)),grepl("A",names(df)))]<- 2

such that

> df
  A1 B1 C1 A2 A3 A4 A5 C2
1  2  0  1  2  1  1  2  1
2  1  1  1  1  1  1  1  1
3  1  1  1  1  1  1  1  1
4  2  0  0  2  2  2  2  0
5  2  0  0  2  1  2  2  0
6  1  0  0  2  1  1  1  1
7  1  0  0  2  1  1  1  0
8  1  0  0  2  1  1  1  0
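
The mask in both one-liners is a logical matrix with the same shape as df. A sketch that builds it in named steps may be easier to follow; `is_a_col` and `mask` are hypothetical names, and `startsWith()` is swapped in for `grepl("A", ...)` here so that only names beginning with "A" match.

```r
# Rebuild the example data from the question
df <- data.frame(A1 = c(0,1,1,0,0,1,1,1), B1 = c(0,1,1,0,0,0,0,0),
                 C1 = c(1,1,1,0,0,0,0,0), A2 = c(0,1,1,0,0,0,0,0),
                 A3 = c(1,1,1,0,1,1,1,1), A4 = c(1,1,1,0,0,1,1,1),
                 A5 = c(0,1,1,0,0,1,1,1), C2 = c(1,1,1,0,0,1,0,0))

# One logical per column, repeated down the rows (byrow = TRUE fills row by row)
is_a_col <- matrix(startsWith(names(df), "A"),
                   nrow = nrow(df), ncol = ncol(df), byrow = TRUE)

# TRUE only where the value is 0 and the column is one of the A columns
mask <- df == 0 & is_a_col

df[mask] <- 2
```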