I know the title of this post is complicated, but I haven't found my exact situation in online examples.
I have a named (non-anonymous) function which takes a row of a tibble, a string (structure), and a numeric (percent) as input and performs linear interpolation, iterating along a subset of the values in the row. (This is NOT a column-wise operation.) The "math" of it uses the values in the cells as well as numbers extracted from the names of the columns. The columns have names like GTV0, GTV1, … GTV135.
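To make the "math" concrete, here is a minimal sketch on made-up numbers. The values are invented for illustration (not my real data), and it assumes the column names follow the GTV0, GTV1, … pattern described above:

```r
# Toy row: percent volume at doses 0..4 (made-up values, not real DVH data)
row <- c(GTV0 = 100, GTV1 = 99, GTV2 = 97, GTV3 = 90, GTV4 = 50)

# The doses come from the column names: strip the "GTV" prefix
dose <- as.double(sub("^GTV", "", names(row)))

# Find the pair of neighbouring columns whose values bracket 95
i <- unname(which(row[-length(row)] > 95 & row[-1] < 95))

# Linear interpolation between the two bracketing columns
d95 <- dose[i] + (95 - row[[i]]) / (row[[i + 1]] - row[[i]]) * (dose[i + 1] - dose[i])
d95
```

Here 95 falls between GTV2 (97) and GTV3 (90), so the interpolated value lies between 2 and 3.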
The working code for this follows. I reproduce it here for completeness, though the specifics aren't necessarily germane to the question below.
# This function works if fed one row of a df at a time, but isn't "multi-dimensional":
Dx <- function(df, structure, percent) {
  # First, make sure we've got our data in the right formats:
  df <- df %>% tibble() %>% select(starts_with(structure)) %>% rowwise()
  structure <- toString(structure)
  percent <- as.double(percent)
  # If we don't have any DVH data for the structure, return "NA"
  if(is.na(df[[9]])) return(NA)
  for(i in 9:(length(df) - 1)) { # The V0 is the 9th entry in the array, so start iterating there.
    # Deal with pesky NA's as iterating along (convert to 0's):
    if(is.na(df[[i]])) df[[i]] <- 0
    if(is.na(df[[i+1]])) df[[i+1]] <- 0
    # Typically unlikely for the cell's value to be a round percent, but:
    if(df[[i]] == percent) {
      answer <- colnames(df[i])
      return(as.double(str_replace(answer, paste0(structure, "V"), "")))
    } else if(df[[i]] > percent & df[[i+1]] < percent) { # This is why we stop at "length - 1" of data frame.
      # Do the linear interpolation here
      # First, capture the names of the two columns:
      column1 <- colnames(df[i])
      column2 <- colnames(df[i+1])
      # Strip the structure names from the column names and convert to doubles:
      column1 <- as.double(str_replace(column1, paste0(structure, "V"), ""))
      column2 <- as.double(str_replace(column2, paste0(structure, "V"), ""))
      # Perform the linear interpolation:
      return(as.double(column1 + ((percent - df[[i]])/(df[[i+1]] - df[[i]]) * (column2 - column1))))
    }
  }
}
My question is: How do I purrr-ify this? Ideally, I would use this with mutate to create a new column and place the interpolated values into it, row by row. My question is two-part:
- How do I call the named function Dx?
- How do I have to modify the guts of the function (if at all) to work with purrr?
I thought it would be something like:
df <- df %>% rowwise() %>% mutate(GTVD95 = pmap_dfr(df, Dx, "GTV", 95))
But that isn't right.
I can call this existing function with a for loop:
for (i in 1:nrow(df)) {
  df$GTVD95[i] <- Dx(df[i,], "GTV", 95)
}
But that's not ideal: I want to find ~20 interpolated points, so I'd like to wrap even that in a loop rather than call it 20 times, changing the number (e.g., the two 95's in the above loop) each time.
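To illustrate the pattern I'm after, here is a rough sketch on made-up data. Everything in it is an assumption for illustration: `toy` is an invented tibble, `toy_dx()` is a hypothetical stand-in for Dx that uses base R's approx() instead of my hand-written loop, and the column names follow the GTV0, GTV1, … pattern. It is not a tested answer for my real data.

```r
library(dplyr)
library(purrr)

# Made-up data: two rows, percent volume at doses 0..4
toy <- tibble(
  id   = c("a", "b"),
  GTV0 = c(100, 100), GTV1 = c(99, 98), GTV2 = c(97, 92),
  GTV3 = c(90, 60),   GTV4 = c(50, 10)
)

# Hypothetical stand-in for Dx(): takes a one-row data frame, returns one number
toy_dx <- function(row, structure, percent) {
  v <- unlist(select(row, starts_with(structure)))
  dose <- as.double(sub(paste0("^", structure), "", names(v)))
  # approx() linearly interpolates dose as a function of the cell values
  approx(x = v, y = dose, xout = percent)$y
}

# One new column per percent, filled row by row
percents <- c(95, 90)
results <- toy
for (p in percents) {
  results[[paste0("GTVD", p)]] <-
    map_dbl(seq_len(nrow(toy)), function(i) toy_dx(toy[i, ], "GTV", p))
}
results
```

The function always reads from the unchanged `toy`, so the newly added GTVD95 column (which also starts with "GTV") is never fed back into the interpolation.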
I appreciate any insight! Thanks in advance!
Please provide sample data with dput(df[1:10,1:10]). Please also provide the expected output of this example data so we can check our solution. See How to make a great R reproducible example for more. – Ian Campbell
mutate(GTVD95 = Dx(cur_data(), "GTV", 95)) will work – Abdessabour Mtk