I'm trying to compute summary statistics across the rows (cases) of a data.frame with dplyr, but to no avail. I created two functions for this:

f1 <- function(x) {
  c(s = sum(x), 
    m = mean(x), 
    v = var(x))
}

f2 <- function(x) {
  apply(x, 1, f1)
}

My data.frame (data_1):

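# Create six numeric vectors named var_1 ... var_6
for (i in 1:6) {
  assign(paste('var', i, sep = '_'),
         runif(30, 20, 100))
}

# ls() takes a regular expression, so anchor it on the 'var_' prefix
data_1 <- do.call(
  cbind.data.frame,
  mget(ls(pattern = '^var_'))
)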

Using dplyr functions:

library(dplyr)

data_1 %>%
  mutate_at(.vars = vars (starts_with('v')),
            .funs = funs(.= f2))

data_1 %>%
  mutate_if(is.numeric, .funs = funs(.= f2))

Error in mutate_impl(.data, dots) : Evaluation error: dim(X) must have a positive length.

Since the analysis is done row by row and f1 returns three statistics (sum, mean, and variance), the expected result is three new columns.

apply(x, 1, f1) doesn't make much sense if x has fewer than two dimensions. dplyr::mutate_* applies the functions column by column, not row by row. - Rui Barradas
How to solve this? - neves
I'm on mobile so I can't check the documentation, but would rowwise() solve this? Alternatively, you could transpose data_1. - Andrew
How would a solution with rowwise() work? - neves
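
To illustrate the point made in the first comment, here is a minimal sketch in base R. It assumes the f1 from the question; the seed and the matrix-based construction of data_1 are additions made only so the example is self-contained and reproducible. A single column is a plain vector with no dim attribute, which is why apply() fails inside mutate_*, whereas applying f1 over the rows of the whole data frame works.

```r
f1 <- function(x) c(s = sum(x), m = mean(x), v = var(x))

set.seed(1)  # assumption: seed added only for reproducibility
data_1 <- as.data.frame(
  matrix(runif(180, 20, 100), nrow = 30,
         dimnames = list(NULL, paste0("var_", 1:6)))
)

# mutate_* passes one column at a time; a column has no dimensions,
# so apply() raises the error seen in the question
err <- tryCatch(apply(data_1$var_1, 1, f1),
                error = function(e) conditionMessage(e))
print(err)

# Applying f1 over the rows of the whole data frame returns a 3 x 30
# matrix; transposing it gives one row of statistics per case
row_stats <- t(apply(data_1, 1, f1))
result <- cbind(data_1, row_stats)
head(result)
```

The result keeps the six original columns and adds s, m, and v, which matches the three columns expected in the question.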

1 Answer

In fact, although it is not deprecated, rowwise() does not play well with other grouping and summary functions, so it is best avoided in dplyr. A useful alternative is to group by row number. Here is a solution to the above using this approach.

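# Column names as symbols, so they can be spliced with !!!
colNames <- syms(paste0("var_", 1:6))
data_1 %>%
  group_by(row = row_number()) %>%  # one group per row
  summarize(dataSum  = sum(!!!colNames),
            # mean() and var() need a single vector, hence c()
            dataMean = mean(c(!!!colNames)),
            dataVar  = var(c(!!!colNames)))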
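
As a possible alternative for readers on dplyr 1.0.0 or later, where rowwise() was reworked and c_across() was introduced, the same row statistics can be sketched as follows. This assumes that newer dplyr version; the seed and the matrix-based construction of data_1 are additions so the example runs on its own, and the output names s, m, and v simply mirror f1 from the question.

```r
library(dplyr)

set.seed(1)  # assumption: seed added only for reproducibility
data_1 <- as.data.frame(
  matrix(runif(180, 20, 100), nrow = 30,
         dimnames = list(NULL, paste0("var_", 1:6)))
)

result <- data_1 %>%
  rowwise() %>%                                   # treat each row as its own group
  mutate(s = sum(c_across(starts_with("var_"))),  # c_across() collects the row's values
         m = mean(c_across(starts_with("var_"))),
         v = var(c_across(starts_with("var_")))) %>%
  ungroup()                                       # drop the row-wise grouping again

head(result)
```

Using mutate() rather than summarize() keeps the original columns alongside the three new ones; the starts_with("var_") selection does not pick up s, m, or v as they are created.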