I'm facing an issue in R which I have described below. I need the count and average of all columns (avoiding NA's) per group ID which satisfy a condition.
I have the dataset below and in it there is an additional column. "T"
structure(list(ID = c(1L, 1L, 1L, 2L, 3L, 3L), S1 = c(NA, 5L,
1L, 2L, 4L, 2L), S2 = c(1L, 2L, 3L, 7L, NA, 11L), T = c(3L, 3L,
3L, 5L, 2L, 2L)), class = "data.frame", row.names = c("1", "2",
"3", "4", "5", "6"))
Is it possible to get the count and average of each column only when the column fulfills the condition (column value <= T value); (In the original dataset, there are more than just S1 and S2..)
The resulting dataframe should look like:
ID S1.count S2.count S1.overall S2.overall S1.per S2.per
1 1 1 3 2 3 0.5 1.0
2 2 1 0 1 1 1.0 0.0
3 3 1 0 2 1 0.5 0.0
S1.count & S2.count : Number of observations which are less than T respectively.
S1.overall & S2.overall: Total number of observations (avoiding NA's) in each column.
S1.per & S2.per: S1.count/S1.overall , S2.count/S2.overall.