Summarise values from different columns and rows

Question

Is there a built-in way to calculate sums over different rows and columns? I know that I could form a new dataframe from id, drug, day2, sum_d2, rename the last two columns, delete these columns in the "old" dataframe, perform rbind with the "old" dataframe and summarise by group. But this seems strangely complicated and possibly error-prone.

How do I calculate the sum of sum_1 and sum_2 of drug_a given on 2020-01-02 using id + drugname as grouping variables + day1 + day2 (when these two are the same)?

The reason for this format is that I have to split the dosage of a continous infusion at midnight...

Example data:

id <- c(rep(1,2))
drug <- c(rep("Drug_a",2))                     
day1 <- c(rep("2020-01-01",1),rep("2020-01-02",1))
sum_1 <- c(rep(250,1),rep(550,1))
day2 <- c(rep("2020-01-02",1),rep("2020-01-03",1))
sum_2 <- c(rep(100,1),rep(75,1))

example_data <- data.frame(id,drug,day1,sum_1,day2,sum_2)

  id   drug       day1 sum_1       day2 sum_2
1  1 Drug_a 2020-01-01   250 2020-01-02   100
2  1 Drug_a 2020-01-02   550 2020-01-03    75

Expected output within these lines:

  id   drug       day    sum
1  1 Drug_a 2020-01-01   250 
2  1 Drug_a 2020-01-02   650 
3  1 Drug_a 2020-01-03    75

Can you show your expected output? edit the question and add it to the body of your post. Cheers. — M--

Ben Ben · Accepted Answer · 2020-01-24T16:34:12

Perhaps something like this might work. You can use pivot_longer to put day and sum in single columns (i.e., combine day_1 and day_2 into day, sum_1 and sum_2 into sum).

library(tidyverse)

example_data %>%
  pivot_longer(cols = c(-id, -drug), names_to = c(".value", "group"), names_sep = "_") %>%
  group_by(id, drug, day) %>%
  summarise (total = sum(sum))

# A tibble: 3 x 4
# Groups:   id, drug [1]
     id drug   day        total
  <dbl> <fct>  <fct>      <dbl>
1     1 Drug_a 2020-01-01   250
2     1 Drug_a 2020-01-02   650
3     1 Drug_a 2020-01-03    75

Data

id <- c(rep(1,2))
drug <- c(rep("Drug_a",2))                     
day_1 <- c(rep("2020-01-01",1),rep("2020-01-02",1))
sum_1 <- c(rep(250,1),rep(550,1))
day_2 <- c(rep("2020-01-02",1),rep("2020-01-03",1))
sum_2 <- c(rep(100,1),rep(75,1))

example_data <- data.frame(id,drug,day_1,sum_1,day_2,sum_2)

Summarise values from different columns and rows

2 Answers

data