1
votes

I have a matrix (all_parameters) with 150 columns and 100,000 rows. Every element of this matrix has the value "1". I would like to replace a value with "0" if both of the following conditions are true:

- the col name is either 10, 14, or 27
- the row name starts with "T1_"

I have the following loop which works fine:

T1_missing = c(10,14,27)

for(i in 1:ncol(all_parameters)) {
  if (any(T1_missing == as.integer(colnames(all_parameters)[i]))) { 
    for(j in 1:nrow(all_parameters)) {
      if(grepl("^T1_", rownames(all_parameters)[j])) {
        all_parameters[j,i] <- 0
      }
    }
  }
}

The problem is that the loops take an extraordinarily long time to execute. I already tried to use the apply function, but I was not able to make it work. Can anybody please show me how this could be solved using an apply function (or anything else that is faster than a for loop)?

Thanks for your help!

2
I'm not sure I understood correctly, but why exactly are you iterating through all the columns? Surely you can select the three columns in question and then check the rows for those three? - ytk
Yes, I could indeed check only those three columns. However, I would also like to run this code with other conditions where I need to check other columns as well (and then usually more than just three columns). - user86533

2 Answers

1
votes

You can do it without apply:

df <- data.frame(`10` = rnorm(3), `7` = head(letters,3), `14` = rnorm(3), 
                 check.names = FALSE, row.names = c('T1_A', 'ABC', 'T1_B'))

##             10 7          14
##T1_A -1.8234804 a  1.31575373
##ABC  -0.4232959 b  0.01400561
##T1_B -1.1252495 c -0.32442049

rows.to.change <- grepl('^T1_', rownames(df))  # ^ anchors the match to the start of the row name

cols.missing <- c(10, 14, 27)
cols.to.change <- as.integer(colnames(df)) %in% cols.missing

df[rows.to.change, cols.to.change] <- 0
##             10 7         14
##T1_A  0.0000000 a 0.00000000
##ABC  -0.4232959 b 0.01400561
##T1_B  0.0000000 c 0.00000000
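
The same logical-index approach should carry over to a plain matrix like the one in the question. Below is a minimal sketch, assuming a small stand-in for all_parameters; the dimensions and row names are made up for illustration and are not taken from the question's data.

```r
# Hypothetical stand-in for all_parameters: all ones, 6 rows x 30 columns.
all_parameters <- matrix(1, nrow = 6, ncol = 30,
                         dimnames = list(c("T1_a", "T1_b", "T2_a",
                                           "T2_b", "T3_a", "xT1_"),
                                         1:30))

T1_missing <- c(10, 14, 27)

# Logical vectors: one entry per row and one per column.
rows.to.change <- grepl("^T1_", rownames(all_parameters))
cols.to.change <- colnames(all_parameters) %in% as.character(T1_missing)

# One vectorized assignment replaces the two nested loops.
all_parameters[rows.to.change, cols.to.change] <- 0

# Number of cells that were set to 0 (2 matching rows x 3 matching columns).
sum(all_parameters == 0)
```

Matching column names with %in% quietly ignores any name in T1_missing that does not exist in the matrix, which may or may not be what you want.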
1
votes

Wouldn't this just be a simple vectorized operation:

all_parameters[ grepl("^T1_", rownames(all_parameters) ), T1_missing] <- 0 

Further conditions on the rows can be added with either & (which makes the conditions more restrictive) or | (which includes more rows). I assumed that your use of the term "col names" actually referred to the columns by their numeric position. Using [i, j] operations, you can mix logical indices for the i-rows with numeric indices for the j-columns. This is now tested with the example by jbaums, reproduced here because he has since deleted it:

m <- matrix(1, ncol=30, nrow=12, 
        dimnames=list(paste0('T', rep(1:3, each=4), '_', rep(1:4, 3)),
                      1:30))
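
As a rough check, the vectorized assignment can be compared against the nested loops from the question on this example matrix. The sketch below makes two assumptions that are not in the original posts: it selects columns by name via as.character(T1_missing) rather than by position, and it uses "T3_" as an arbitrary second row prefix to illustrate widening the condition with |.

```r
# Example matrix from the answer, repeated so the block runs on its own.
m <- matrix(1, ncol = 30, nrow = 12,
            dimnames = list(paste0("T", rep(1:3, each = 4), "_", rep(1:4, 3)),
                            1:30))
T1_missing <- c(10, 14, 27)

# Reference result: the nested loops from the question.
expected <- m
for (i in seq_len(ncol(expected))) {
  if (any(T1_missing == as.integer(colnames(expected)[i]))) {
    for (j in seq_len(nrow(expected))) {
      if (grepl("^T1_", rownames(expected)[j])) expected[j, i] <- 0
    }
  }
}

# Vectorized version; as.character() selects columns by name, not position.
vectorized <- m
vectorized[grepl("^T1_", rownames(vectorized)), as.character(T1_missing)] <- 0

# Widening the row condition with | (T1_ or T3_ rows; T3_ is illustrative).
widened <- m
rows <- grepl("^T1_", rownames(widened)) | grepl("^T3_", rownames(widened))
widened[rows, as.character(T1_missing)] <- 0

identical(vectorized, expected)  # do the loop and vectorized results agree?
sum(widened == 0)                # cells zeroed by the widened condition
```

In this example the column names equal the column positions, so indexing by position and by name give the same result. For a matrix whose columns are named differently from their positions, the two would diverge, and character indexing would fail if a requested name is absent.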