I am trying to achieve the following: I have a dataset and a function that subsets this dataset and then performs a series of operations on the subset. Subsetting happens based on row names. I can do it step by step (i.e. by running the function for each subset separately), but I have a list of desired subsets and would like to loop over it. It sounds complicated - please check the example below. This is what I can do:
#dataframe with rownames
whole_dataset <- data.frame(wt1 = c(1, 2, 3, 6, 6),
                            wt2 = c(2, 3, 4, 4, 2))
row.names(whole_dataset) <- c("HTA1", "HTA2", "HTB2", "CSE1", "CSE2")
# two different non-overlapping subsets
his <- c("HTA1", "HTA2", "HTB2")
cse <- c("CSE1", "CSE2")
#this is the function I have
fav_complex <- function(data, complex) {
  small_data <- data[complex, ]   # subset only the rows that you need
  sum.all <- colSums(small_data)  # calculate sum of columns
  return(sum.all)
}
#I generate two separate named vectors
his_data <- fav_complex(data = whole_dataset, complex = his)
cse_data <- fav_complex(data = whole_dataset, complex = cse)
#and merge them
merged_data <- rbind(his_data, cse_data)
It looks like this:
> merged_data
         wt1 wt2
his_data   6   9
cse_data  12   6
I would like to somehow generate the merged_data dataframe without having to call the 'fav_complex' function multiple times. In real life I have about 20 subsets, and it is a lot of code. This is my attempt, which doesn't work:
#I first have a character vector listing all the variable names
subset_list <- c("his", "cse")
#then create a loop that goes over this list
#make an empty dataframe
merged_data2 <- data.frame()
#fill it with a for loop output
for (element in subset_list) {
  result <- fav_complex(data = whole_dataset, element)
  merged_data2 <- rbind(merged_data2, result)
}
I know this is wrong. In this loop, 'element' is just a string, rather than a variable with stuff in it. But I don't know how to make it a variable. noquote(element) didn't work. I tried reading about non-standard evaluation, eval() and substitute(), but it is too abstract for me - I think I am not there yet with my R expertise.
... data not whole_dataset. 2) In the loop use result <- fav_complex(data = whole_dataset, get(element)) - Rui Barradas
... split, lapply, do.call(rbind), or if you don't mind extra dependencies using purrr or similar. (Or, more simply, dplyr/data.table grouped operations if the operations really are as simple as "sum all columns") - Gregor Thomas
... merged_data, in that it lacks colnames and rownames. Would you have any suggestion how to introduce them? I would also be grateful if you could tell me why you don't think using get is a good idea. @RuiBarradas, thank you, I have corrected the error. This solution also produces a dataframe without row names and column names. @Gregor, this is a very simplified example and I find this weird way more convenient, but I might try to re-write it if necessary! - Wera
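A minimal sketch of the lapply / do.call(rbind) idea mentioned in the comments, assuming the subsets can be kept in one named list instead of separate variables. The list name 'subsets' is an assumption made for illustration; the data and fav_complex are taken from the question. Because the list is named, no lookup of variables by string (get) is needed, and the row and column names carry over into the result.

```r
# Data and function as defined in the question
whole_dataset <- data.frame(wt1 = c(1, 2, 3, 6, 6),
                            wt2 = c(2, 3, 4, 4, 2))
row.names(whole_dataset) <- c("HTA1", "HTA2", "HTB2", "CSE1", "CSE2")

fav_complex <- function(data, complex) {
  small_data <- data[complex, ]  # keep only the requested rows
  colSums(small_data)            # named vector of column sums
}

# Assumption: all subsets live in one named list (hypothetical name 'subsets')
subsets <- list(his = c("HTA1", "HTA2", "HTB2"),
                cse = c("CSE1", "CSE2"))

# Apply the function to every subset; 'data' is passed by name,
# so each list element is matched to the 'complex' argument
results <- lapply(subsets, fav_complex, data = whole_dataset)

# Stack the named vectors: row names come from the list names,
# column names from the names of each vector
merged_data2 <- do.call(rbind, results)
merged_data2
```

The result is a matrix rather than a data frame; as.data.frame(merged_data2) converts it while keeping the names. With about 20 subsets, only the list grows, not the number of function calls written out by hand.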