0
votes

Problem context: I want to replace a chr column in a data frame with its numeric equivalent. The plan is to write a function that converts chr to numeric, then use apply() to pass df_credit_status$length_of_service through that function and update the same column, df_credit_status$length_of_service.

I have defined the function and the apply() call shown here.

The column df_credit_status$length_of_service is chr. For this update I may need to create a new field first and then copy the result back to df_credit_status$length_of_service once the data type has changed.

fun_year <- function(length) { as.numeric(gsub("[A-Za-z]|[[:punct:]]|\\s+", "", length))}

df_credit_status$length <- apply(df_credit_status$length_of_service, FUN=fun_year)

The error from apply() is: dim(X) must have a positive length.

dput() shows this data in df_credit_status$length_of_service:

"< 1 year", "2 years", "3 years", "4 years", "5 years", "6 years", "7 years", "8 years", "9 years", "10+ years"
2
sapply is probably what you wanted. apply is for objects that have dimensions (like a matrix), i.e. a non-NULL dim(x). A vector doesn't have that - dario
sapply() worked, thanks, I forgot about that. - user1857373
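
To illustrate the point made in the comments, here is a minimal sketch of why apply() fails on a single column while sapply() works. The vector x is a made-up three-element sample standing in for df_credit_status$length_of_service, not the asker's full data.

```r
# Hypothetical sample standing in for df_credit_status$length_of_service
x <- c("< 1 year", "2 years", "10+ years")

fun_year <- function(length) {
  as.numeric(gsub("[A-Za-z]|[[:punct:]]|\\s+", "", length))
}

# A plain vector has no dim attribute, which is what apply() relies on
dim(x)   # NULL

# sapply() iterates over the elements of a vector, so it works here
res <- sapply(x, fun_year, USE.NAMES = FALSE)
res      # 1 2 10
```

USE.NAMES = FALSE is only there to keep the result unnamed; without it sapply() would label each value with its original string.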

2 Answers

1
votes

Why do you need an apply function at all? You can just use your function directly:

df_credit_status <- data.frame(length_of_service = c("1 year", "2 years", "3 years", "4 years", "5 years", "6 years", "7 years", "8 years", "9 years", "10+ years"))

fun_year <- function(length) {as.numeric(gsub("[A-Za-z]|[[:punct:]]|\\s+", "", length))}

df_credit_status$length <- fun_year(df_credit_status$length_of_service)

df_credit_status
   length_of_service length
1             1 year      1
2            2 years      2
3            3 years      3
4            4 years      4
5            5 years      5
6            6 years      6
7            7 years      7
8            8 years      8
9            9 years      9
10         10+ years     10
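
The question also asked about ending up with a numeric length_of_service rather than an extra column. A possible sketch, assuming a small made-up data frame with the same column name, is to assign the converted values straight back to the original column. Note that this discards the "<" and "+" qualifiers.

```r
# Hypothetical three-row sample; the real data frame has more rows
df_credit_status <- data.frame(
  length_of_service = c("< 1 year", "2 years", "10+ years"),
  stringsAsFactors = FALSE
)

fun_year <- function(length) {
  as.numeric(gsub("[A-Za-z]|[[:punct:]]|\\s+", "", length))
}

# Overwrite the chr column with its numeric version in one step
df_credit_status$length_of_service <- fun_year(df_credit_status$length_of_service)

# Check the new type of the column
str(df_credit_status$length_of_service)
```

No intermediate column is needed, because the right-hand side is fully evaluated before the assignment replaces the column.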
1
votes

Since the function you created works perfectly fine with vectors, we don't need to use sapply:

df_credit_status$length <- fun_year(df_credit_status$length_of_service)

Returns:

   length_of_service length
1           < 1 year      1
2            2 years      2
3            3 years      3
4            4 years      4
5            5 years      5
6            6 years      6
7            7 years      7
8            8 years      8
9            9 years      9
10         10+ years     10
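
One thing worth checking before relying on the converted values: because the function strips everything except digits, different inputs can collapse to the same number, and a value with no digits at all becomes NA. A small sketch with made-up inputs (the "n/a" entry is hypothetical and does not appear in the asker's data):

```r
fun_year <- function(length) {
  as.numeric(gsub("[A-Za-z]|[[:punct:]]|\\s+", "", length))
}

# "< 1 year" and "1 year" both reduce to the string "1"
same <- fun_year(c("< 1 year", "1 year"))

# Nothing is left after stripping letters and punctuation, so the result is NA
missing <- fun_year("n/a")

same
missing
```

If the distinction between "< 1 year" and "1 year" matters for the analysis, it would need to be handled separately before the conversion.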