I need to group a data.table by 30 columns.
Rather than list each column name, can I use a regular expression like grepl for the by argument?
Below is a minimum reproducible example
library(dplyr)
library(ggplot2)
library(data.table)
data <- diamonds %>% cbind("Another Column" = diamonds$cut == "Ideal") %>% data.table
data[, lapply(.SD, sum, na.rm = TRUE), by = list(carat, cut, color, clarity, depth, table, price, `Another Column`)]
I have already tried the following
data[, lapply(.SD, sum, na.rm = TRUE), by = colnames(data) %>% .[!grepl(pattern = "^[a-z]$", .)] %>% as.list]
Update:
- please use pattern ^[a-z]$
- sum column x, y, z by carat, cut, color, clarity, depth, table, price, Another Column
data[, lapply(.SD, sum, na.rm = TRUE), .SDcols = grep(pattern = "^[a-z]$", names(data), value = TRUE)]? - Ronak ShahAnother Column- Kevin Ho