1
votes

I have a small data frame:

ga_sessions_combined <- structure(list(Metric = c("Users", "Bounces", "Transactions", 
"Revenue"), ym_201904 = c(5813, 2124, 193, 24923.155206), ym_201905 = c(7094, 
2445, 257, 40361.46), ym_201906 = c(5647, 2049, 181, 26314.32
), ym_201907 = c(5553, 1864, 144, 23678.14), ym_201908 = c(5392, 
1811, 116, 16893.6), ym_201909 = c(5160, 1933, 121, 23635.754179
), ym_201910 = c(7504, 2381, 187, 34052.48), ym_201911 = c(23808, 
4661, 1421, 285310.131623), ym_201912 = c(33244, 8332, 3259, 
596214.021099), ym_202001 = c(8460, 2987, 233, 38663.293857), 
    ym_202002 = c(7463, 1510, 212, 37170.53), ym_202003 = c(9069, 
    2103, 295, 47716.45)), .Names = c("Metric", "ym_201904", 
"ym_201905", "ym_201906", "ym_201907", "ym_201908", "ym_201909", 
"ym_201910", "ym_201911", "ym_201912", "ym_202001", "ym_202002", 
"ym_202003"), row.names = c(NA, -4L), class = c("tbl_df", "tbl", 
"data.frame"))

Looks like this:

ga_sessions_combined
# A tibble: 4 x 13
  Metric ym_201904 ym_201905 ym_201906 ym_201907 ym_201908 ym_201909 ym_201910 ym_201911 ym_201912 ym_202001 ym_202002
  <chr>      <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>     <dbl>
1 Users      5813      7094      5647      5553      5392      5160      7504     23808     33244      8460      7463 
2 Bounc…     2124      2445      2049      1864      1811      1933      2381      4661      8332      2987      1510 
3 Trans…      193       257       181       144       116       121       187      1421      3259       233       212 
4 Reven…    24923.    40361.    26314.    23678.    16894.    23636.    34052.   285310.   596214.    38663.    37171.

I would like to add a new row 'ConversionRate' where the column Metric would say 'ConversionRate' and the value would be Transactions (rows 3) / Users (Row 1).

Almost like dplyr mutate but across rows instead of columns. How can I do this?

2

2 Answers

2
votes

Here is an option, where we filter the rows where the 'Metric' is 'Transactions' 'or 'Users', then use summarise_all to divide the rows, and bind that with the original dataset after creating the 'Metric' as 'ConversionRate'

library(dplyr)
ga_sessions_combined %>% 
    filter(Metric %in% c("Transactions", "Users")) %>%
    select(-1) %>%
    summarise_all(~ last(.)/first(.)) %>% 
    bind_cols(Metric = 'ConversionRate', .) %>% 
    bind_rows(ga_sessions_combined, .)

Or another option is add_row from tibble, create a new row with 'Metric' as 'ConversionRate', and replace the NA in the numeric columns in the last row by dividing the specific rows

library(tibble)
ga_sessions_combined %>% 
    add_row(Metric = 'ConversionRate') %>%
    mutate_if(is.numeric, 
       ~ replace(., n(), .[Metric == 'Transactions']/.[Metric == 'Users']))
1
votes

In base R, we can use rbind/cbind like :

rbind(ga_sessions_combined, 
  cbind(Metric = 'ConversionRate', 
        subset(ga_sessions_combined, Metric == 'Transactions', select = -1)/
         subset(ga_sessions_combined, Metric == 'Users', select = -1)))


#          Metric ym_201904 ym_201905 ym_201906 ym_201907 ym_201908 ym_201909....
#1          Users  5813.000  7094.000  5647.000  5553.000  5392.000  5160.000....
#2        Bounces  2124.000  2445.000  2049.000  1864.000  1811.000  1933.000....
#3   Transactions   193.000   257.000   181.000   144.000   116.000   121.000....
#4        Revenue 24923.155 40361.460 26314.320 23678.140 16893.600 23635.754....
#5 ConversionRate     0.033     0.036     0.032     0.026     0.022     0.023....