0
votes

Example data:

Project  2016   2017   2018   2019
Proj1      42     36    400    250
Proj2      96    780     60    900
Proj3     180    230      0      0

I have a set of financial data from this year, that comes with previous years' financial data as well. I am trying to mutate the data so that I add the previous three years into a "Previous Funding" column.

The data has columns labeled with 2016, 2017, 2018, 2019... etc

Totals<-Totals %>% mutate("Previous Years"=`2016`+`2017`+`2018`)

Now, I'm actually trying to set this up so that I can select this programmatically; next year, I'd rather look at 2017, 2018, and 2019 data, naturally, and I'd like to just set it up so that I can enter a year number, and it will select the right columns using code.

year = 2019

index<-which(colnames(Totals)==year)

Totals<-Totals%>%
##Here's where it gets hairy
mutate("Previous Years"=Totals[index-3]+Totals[index-2]+Totals[index-1])

Error: Column `Previous Years` is of unsupported class data.frame

So, there are some issues. Obviously, I'm calling something incorrectly, scenario 1 above works like a charm, and the second scenario gives an error. I feel like it has something to do with the back ticks that you would normally use to call columns with unusual names in dplyr.

What is the proper way to do something like this?

2
I'm sure this is solvable with your current data.frame, but the root of the problem is that you have a single variable (funding) spread across multiple columns (one each year). All of the tidyverse works much better if you gather your data.frame. Something like long_Totals <- gather(Totals, Year, Funding, starts_with("20")). - Axeman
You'd be more likely to get useful answers if you provide a reproducible example. - Axeman
I'll try to make an example of a funding table quickly and add it to the body of my question. I believe the data is essentially tidy. - OfTheAzureSky
Try printing Totals[2018] to the console and see what it outputs. It won't give you a vector, which is the source of the error - Calum You

2 Answers

1
votes

I don't think your data is tidy. Things become easier if you make it tidy. For example:

Read in data

library(dplyr)
library(tidyr)

Totals <- data.table::fread('Project  2016   2017   2018   2019
Proj1     $42    $36   $400   $250
Proj2     $96   $780    $60   $900
Proj3    $180   $230     $0     $0', header = TRUE)

year <- 2019

Tidy and calculate previous funding.

summ <- Totals %>% 
  gather(Year, Funding, -Project) %>% 
  mutate(Funding = readr::parse_number(Funding)) %>% 
  group_by(Project) %>% 
  summarise(Previous_funding = sum(Funding[Year %in% (year - 3):(year - 1)]))
# A tibble: 3 x 2
  Project Previous_funding
  <chr>              <dbl>
1 Proj1                478
2 Proj2                936
3 Proj3                410

You could also use mutate instead of summarise to keep all data:

# A tibble: 12 x 4
# Groups:   Project [3]
   Project Year  Funding Previous_funding
   <chr>   <chr>   <dbl>            <dbl>
 1 Proj1   2016       42              478
 2 Proj2   2016       96              936
 3 Proj3   2016      180              410
 4 Proj1   2017       36              478
 5 Proj2   2017      780              936
 6 Proj3   2017      230              410
 7 Proj1   2018      400              478
 8 Proj2   2018       60              936
 9 Proj3   2018        0              410
10 Proj1   2019      250              478
11 Proj2   2019      900              936
12 Proj3   2019        0              410

Or if you want, you can add the previous funding back to your original wide table:

left_join(Totals, summ, 'Project')
 Project 2016 2017 2018 2019 Previous_funding
1   Proj1  $42  $36 $400 $250              478
2   Proj2  $96 $780  $60 $900              936
3   Proj3 $180 $230   $0   $0              410
0
votes

Using your example data,

library(tidyverse)

adf <- tibble(
  project = c("pro1","pro2","pro3"),
  `2016` = c(42,96,180),
  `2017` = c(36,780,230),
  `2018` = c(400,60,0),
  `2019` = c(250,900,0)
)

Now let's write a function that will add the sum given the specified year

previous <- function(data, year){
  data%>%gather(Year, Funding, -project)%>%
    mutate_at(vars(Year), list(~as.numeric(.)))%>%
    split(.$project)%>%
    map(~(.)%>%filter(Year < 
    year)%>%summarise(UQ(paste0("Prior_to_",as.character(year))) := 
    sum(Funding)))%>%bind_rows()%>%
    bind_cols(data,.)
}

Now let's add the new column that has the sum of funding from years prior to 2018.

> previous(data = adf, year = 2018)
# A tibble: 3 x 6
  project `2016` `2017` `2018` `2019` Prior_to_2018
  <chr>    <dbl>  <dbl>  <dbl>  <dbl>         <dbl>
1 pro1        42     36    400    250            78
2 pro2        96    780     60    900           876
3 pro3       180    230      0      0           410
>

You can also use group_by() in the function as follows:

previous <- function(data, year){
  data%>%gather(Year, value, -project)%>%
    mutate_at(vars(Year), list(~parse_number(.)))%>%
    group_by(project)%>%
    summarise(UQ(paste0("Prior_to_",as.character(year))) := 
sum(value[Year < year]))%>%
    left_join(data, ., by = 'project')
}