0
votes

The answer for this previous question: extrapolate in R for a time-series data does not work for me due to R versions.

I have a dataframe NEI_othertier1_long that looks similar to like this:

state    pollutant    Sector       Fuel      description    year     value
AK       Ammonia      Refining     Diesel     industrial    2008      1.18
AK       Ammonia      Refining     Diesel     industrial    2009      NA
AK       Ammonia      Refining     Diesel     industrial    2010      NA
AK       Ammonia      Refining     Diesel     industrial    2011      5.76
AK       Ammonia      Refining     Diesel     industrial    2012      NA
AK       Ammonia      Refining     Diesel     industrial    2013      NA
AK       Ammonia      Refining     Diesel     industrial    2014      5.83
AK       Ammonia      Refining     Diesel     industrial    2015      NA
AK       Ammonia      Refining     Diesel     industrial    2016      NA
AK       Ammonia      Refining     Diesel     industrial    2017      8.96
AK       Ammonia      Refining     Diesel     industrial    2018      NA
AK       Ammonia      Refining     Diesel     industrial    2019      NA

I have values for 2008, 2011, 2014, and 2017. I have been able to successfully linearly interpolate 2009-2016 using this code:

    NEI_othertier1_long %>%
  dplyr::mutate( value = na.approx(value, na.rm = FALSE, rule = 2) ) -> NEI_othertier1_interpolated

But the interpolation carries the 2017 value forward for 2018 and 2019. I am wanting to linearly extrapolate the 2018 and 2019 values from the years prior.

I have R version 3.5.2 (and can't update), so cannot install latticeExtra, which Hmisc is dependent on to use the approxExtrap function.

Any and all help is appreciated!

dput(head(NEI_othertier1_long)) structure(list(state = c("AK", "AK", "AK", "AK", "AK", "AK", "AK", "AK", "AK", "AK", "AK", "AK"), pollutant = c("Ammonia", "Ammonia", "Ammonia", "Ammonia", "Ammonia", "Ammonia", "Ammonia", "Ammonia", "Ammonia", "Ammonia", "Ammonia", "Ammonia"), CEDS_Sector = c("1A1b_Pet-refining", "1A1b_Pet-refining", "1A1b_Pet-refining", "1A1b_Pet-refining", "1A1b_Pet-refining", "1A1b_Pet-refining", "1A1b_Pet-refining", "1A1b_Pet-refining", "1A1b_Pet-refining", "1A1b_Pet-refining", "1A1b_Pet-refining", "1A1b_Pet-refining"), CEDS_Fuel = c("diesel_oil", "diesel_oil", "diesel_oil", "diesel_oil", "diesel_oil", "diesel_oil", "diesel_oil", "diesel_oil", "diesel_oil", "diesel_oil", "diesel_oil", "diesel_oil" ), tier1_description = c("FUEL COMB. INDUSTRIAL", "FUEL COMB. INDUSTRIAL", "FUEL COMB. INDUSTRIAL", "FUEL COMB. INDUSTRIAL", "FUEL COMB. INDUSTRIAL", "FUEL COMB. INDUSTRIAL", "FUEL COMB. INDUSTRIAL", "FUEL COMB. INDUSTRIAL", "FUEL COMB. INDUSTRIAL", "FUEL COMB. INDUSTRIAL", "FUEL COMB. INDUSTRIAL", "FUEL COMB. INDUSTRIAL"), unit = c("TON", "TON", "TON", "TON", "TON", "TON", "TON", "TON", "TON", "TON", "TON", "TON"), year = 2008:2019, emissions = c(1.18, NA, NA, 5.76, NA, NA, 5.83, NA, NA, 8.96, NA, NA)), row.names = c(NA, -12L), class = c("grouped_df", "tbl_df", "tbl", "data.frame"), groups = structure(list(state = "AK", pollutant = "Ammonia", CEDS_Sector = "1A1b_Pet-refining", CEDS_Fuel = "diesel_oil", tier1_description = "FUEL COMB. INDUSTRIAL", unit = "TON", .rows = list(1:12)), row.names = c(NA, -1L), class = c("tbl_df", "tbl", "data.frame"), .drop = TRUE))

1

1 Answers

1
votes

approxExtrap is just a wrapper around approx so you can just copy the function definition and use it.

NEI_othertier1_long %>% dplyr::mutate(x = approxExtrap(year, value, year, na.rm = TRUE)$y)

Here's approxExtrap if you can't find it:

approxExtrap <- function (x, y, xout, method = "linear", n = 50, rule = 2, f = 0, 
  ties = "ordered", na.rm = FALSE) 
{
  if (is.list(x)) {
    y <- x[[2]]
    x <- x[[1]]
  }
  if (na.rm) {
    d <- !is.na(x + y)
    x <- x[d]
    y <- y[d]
  }
  d <- !duplicated(x)
  x <- x[d]
  y <- y[d]
  d <- order(x)
  x <- x[d]
  y <- y[d]
  w <- approx(x, y, xout = xout, method = method, n = n, rule = 2, 
    f = f, ties = ties)$y
  r <- range(x)
  d <- xout < r[1]
  if (any(is.na(d))) 
    stop("NAs not allowed in xout")
  if (any(d)) 
    w[d] <- (y[2] - y[1])/(x[2] - x[1]) * (xout[d] - x[1]) + 
    y[1]
  d <- xout > r[2]
  n <- length(y)
  if (any(d)) 
    w[d] <- (y[n] - y[n - 1])/(x[n] - x[n - 1]) * (xout[d] - 
        x[n - 1]) + y[n - 1]
  list(x = xout, y = w)
}