Question solved!
Question:
In R, I've been trying to find an elegant way to apply several functions, each with different arguments, to a list containing many tibbles/data.frames. However, I'm struggling to pass the arguments through correctly. I'm attempting to clean and pre-process text data in pharmaceuticals, and I've been trying to use modify_if, invoke, map and more. Any help is greatly appreciated.
Note: only starting to learn programming, please forgive the naivety :)
# Set up Example Data
Test_DataFrame <- tibble("Integer_Variable" = c(rep(x = 1:4))
,"Character_Variable" = c("tester to upper"
,"test squishing"
,"canitcomprehend?.,-0(`kljndsfiuhaweraeriou140987645=Error?"
," test white space triming " ))
# With modify_if and a single function plus its arguments, it works:
# Modify character vectors by trimming the left side of the string
modify_if(.x = Test_DataFrame
,.p = is.character
,.f = str_trim
, side = "left") # Works well
# Expected results
# A tibble: 4 x 2
# Integer_Variable Character_Variable
# <int> <chr>
# 1 1 "tester to upper"
# 2 2 "test squishing"
# 3 3 "canitcomprehend?.,-0(`kljndsfiuhaweraeriou140987645=Error?"
# 4 4 "test white space triming "
####### Note the trailing whitespace on the right, proving the argument is being applied!
However, when I try doing this with more than one function, each with its own arguments, I hit a wall (the function arguments are ignored). I've tried a lot of combinations of modify_if (some below) and other functions such as invoke (but it's being retired) and exec with map (which makes no sense to me). So far no success. Any help is greatly appreciated.
# does not work
modify_if(.x = Test_DataFrame
,.p = is.character # = the condition to specify which column to apply the functions to
,.f = c( # a pairwise list of "name" = "function to apply" to apply to each column where the condition = TRUE
UpperCase = str_to_upper # Convert strings to upper case
,TrimLeadTailWhiteSpace = str_trim # trim leading and ending whitespace
,ExcessWhiteSpaceRemover = str_squish) # if you find any double or more whitespaces (eg " " or " ") then cut it down to " "
, side = "left" # it's ignoring these arguments.
)
# Does not work
modify_if(.x = Test_DataFrame
,.p = is.character
,.f = c(UpperCase = list(str_to_upper) # listed variant doesn't work
,TrimLeadTailWhiteSpace = list(str_trim, side = "left")
,ExcessWhiteSpaceRemover = list(str_squish))
) # returns the integer variable instead of the character so drastically wrong
# Set up Function - Argument Table
Function_ArgumentList <- tibble("upper" = list(str_to_upper)
,"trim" = list(str_trim, side = "left")
,"squish" = list(str_squish))
# Doesn't work
modify_if(.x = Test_DataFrame
,.p = is.character
,.f = Function_ArgumentList)
# Error: Can't convert a `tbl_df/tbl/data.frame` object to function
# Run `rlang::last_error()` to see where the error occurred.
I realise that the functions used in the above examples would work fine without arguments, but this is a simplified example of the problem I'm encountering.
Solution:
Thanks to @stefan and @BenNorris for the help below! To show @stefan's solution more clearly, I've slightly modified the answer to:
library(dplyr)
library(purrr)
library(stringr)
Test_DataFrame <- tibble("Integer_Variable" = c(rep(x = 1:4))
,"Character_Variable" = c("tester to upper"
,"test squishing"
,"canitcomprehend?.,-0(`kljndsfiuhaweraeriou140987645=Error?"
," test white space triming " )
)
f_help <- function(x, side = "left") {
str_to_upper(x) %>%
str_trim(side = side) # %>%
# str_squish() # note that this is commented out
}
modify_if(.x = Test_DataFrame
,.p = is.character
,.f = f_help
,side = "left")
# A tibble: 4 x 2
# Integer_Variable Character_Variable
# <int> <chr>
# 1 1 "TESTER TO UPPER"
# 2 2 "TEST SQUISHING"
# 3 3 "CANITCOMPREHEND?.,-0(`KLJNDSFIUHAWERAERIOU140987645=ERROR?"
# 4 4 "TEST WHITE SPACE TRIMING "
# Note the right-sided whitespace is still present! It worked!!!
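For anyone who wants to avoid writing a helper by hand, here is a minimal sketch of an alternative, not part of the accepted answer. It assumes purrr 0.3.0 or later (for the `.dir` argument of `compose()`). `partial()` pre-fills the arguments of each function and `compose()` chains them into one function that `modify_if` can accept. The names `clean_text`, `Test_List` and `Cleaned_List`, along with the small example tibbles, are made up for illustration. The sketch also shows one way to run the cleaner over a list of tibbles with `map`, as in the original question.

```r
library(purrr)
library(stringr)
library(tibble)

# Build one cleaning function out of several steps.
# partial() pre-fills the extra argument of str_trim, so every step
# is a function of the string alone.
clean_text <- compose(
  str_to_upper,
  partial(str_trim, side = "left"),
  str_squish,
  .dir = "forward" # apply the steps in the order they are listed
)

# Hypothetical list of tibbles, standing in for the real data
Test_List <- list(
  first  = tibble(id = 1:2, txt = c("  a  b ", "c   d")),
  second = tibble(id = 3L, txt = "  e f")
)

# Apply the composed cleaner to the character columns of every tibble.
# An explicit function is used so that modify_if's .f argument is not
# mistaken for map's own .f argument.
Cleaned_List <- map(
  Test_List,
  function(df) modify_if(df, is.character, clean_text)
)
```

Because `str_squish` also trims both ends, the left-only trim is not visible in this particular chain. Drop `str_squish` from `compose()` to see the trailing whitespace survive, as in the solution above.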
"Question solved!" at beginning of post as check mark below confirms resolution. – Parfait