packageVersion("dplyr")
#[1] ‘0.8.99.9002’
Please note that this question uses dplyr's new across() function. To install the latest dev version of dplyr, run remotes::install_github("tidyverse/dplyr"). To restore the released version of dplyr, run install.packages("dplyr"). If you are reading this at some point in the future and are already on dplyr 1.X+, you won't need to worry about this note.
library(tidyverse)
df <- tibble(Date = c(rep(as.Date("2020-01-01"), 3),
                      rep(as.Date("2020-02-01"), 2)),
             Type = c("A", "A", "B", "C", "C"),
             col1 = 1:5,
             col2 = c(0, 8, 0, 3, 0),
             col3 = c(25:29),
             colX = rep(99, 5))
#> # A tibble: 5 x 6
#>   Date       Type   col1  col2  col3  colX
#>   <date>     <chr> <int> <dbl> <int> <dbl>
#> 1 2020-01-01 A         1     0    25    99
#> 2 2020-01-01 A         2     8    26    99
#> 3 2020-01-01 B         3     0    27    99
#> 4 2020-02-01 C         4     3    28    99
#> 5 2020-02-01 C         5     0    29    99
I'd like to sum columns 1 through X above row-wise, grouped by "Date" and "Type". I will always start at the third column (i.e. col1), but will never know the numerical value of X in colX. That's OK because I can use the length of the data frame to determine how far 'out' I need to go to capture all columns through the end of the data frame. Here's my approach:
df %>%
  group_by(Date, Type) %>%
  summarize(across(3:length(.)), sum())
#> Error: Problem with `summarise()` input `..1`.
#> x Can't subset columns that don't exist.
#> x Locations 5 and 6 don't exist.
#> i There are only 4 columns.
#> i Input `..1` is `across(3:length(.))`.
#> i The error occured in group 1: Date = 2020-01-01, Type = "A".
#> Run `rlang::last_error()` to see where the error occurred.
But it seems my usage of the base R length(.) function is improper. Am I using dplyr's new across() function correctly? How can I get the length of the data frame in the portion of the pipe where I need it? I'll never know how many columns there are to the end, nor are the actual names nearly as clean as my example data frame.
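For reference, here is one possible direction, offered as a sketch rather than a confirmed fix. The error message suggests two things: inside a grouped summarise(), across() only sees the four non-grouping columns, while `.` in length(.) refers to the full six-column data frame, so positions 5 and 6 are out of range; and sum() sits outside the across() call instead of being passed to it. Assuming the goal is a per-group sum of every non-grouping column, and assuming dplyr 1.0.0 or later, selecting with everything() avoids counting columns at all:

```r
library(dplyr)  # assumes dplyr >= 1.0.0, where across() is available

# Same example data as in the question
df <- tibble(
  Date = c(rep(as.Date("2020-01-01"), 3), rep(as.Date("2020-02-01"), 2)),
  Type = c("A", "A", "B", "C", "C"),
  col1 = 1:5,
  col2 = c(0, 8, 0, 3, 0),
  col3 = 25:29,
  colX = rep(99, 5)
)

# Grouping columns are excluded from across() automatically, so
# everything() here means "every column except Date and Type".
# The summary function is passed to across() itself.
result <- df %>%
  group_by(Date, Type) %>%
  summarise(across(everything(), sum), .groups = "drop")

print(result)
```

If only a trailing range of columns is wanted, a selection such as col1:last_col() inside across() should pick the same columns here without needing the column count, though that variant is not exercised in the sketch above.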