I am trying to convert factor variables to numeric. I have tried both of these solutions:
as.numeric(levels(f))[f]
as.numeric(as.character(f))
But the issue persists: both produce the warning message "NAs introduced by coercion".
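To make the symptom concrete, here is a minimal sketch of when those two idioms work and when they return NA. The factors `f_num` and `f_text` are made up for illustration and are not part of my data; the point is only that `as.numeric()` on a label such as "00:Bad" cannot parse it as a number.

```r
# Hypothetical factors, for illustration only
f_num  <- factor(c("10", "20", "10"))              # labels are numeric text
f_text <- factor(c("00:Bad", "01:Ind", "00:Bad"))  # labels are not numeric text

# Works: each label parses as a number
ok <- as.numeric(levels(f_num))[f_num]

# Fails: no label parses as a number, so every element becomes NA
# (suppressWarnings hides "NAs introduced by coercion")
bad <- suppressWarnings(as.numeric(as.character(f_text)))

print(ok)   # 10 20 10
print(bad)  # NA NA NA
```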
Reproducible example -
df = data.frame(x = c("10: Already Delinquent 90+",
"11: Credit History <6 Months",
"12: Current Balance = 0",
"13: Balance (2-6)=0",
"20: 1+ x 90+",
"30: 3+ x 60-89",
"31: 2 x 60-89",
"32: 1 x 60-89",
"40: 3+ x 30-59",
"41: 2 x 30-59",
"42: 1 x 30-59",
"50: Insufficient Performance",
"60: 3+ x 1-29",
"61: 2 x 1-29",
"62: 1 x 1-29",
"70: Never delinquent"),
y = c("00:Bad",
"01:Ind",
"02:Good",
"NA",
"00:Bad",
"01:Ind",
"02:Good",
"NA",
"00:Bad",
"01:Ind",
"02:Good",
"NA",
"00:Bad",
"01:Ind",
"02:Good",
"NA"),
z = ceiling(rnorm(16)),
stringsAsFactors = TRUE)  # needed in R >= 4.0.0 so that x and y are factors
#Select all the factor variables
factorvars = colnames(df)[which(sapply(df,is.factor))]
#Concatenate with "_Num"
xxx <- paste(factorvars, "_Num", sep="")
#Converting Factor to Numeric
for (i in seq_along(factorvars)) {
  df[,xxx[i]] = NA
  df[,xxx[i]] = as.numeric(levels(df[,factorvars[i]]) [df[,factorvars[i]]])
}
I want to keep the factor variables and create new variables that hold each level converted to a number. The desired output looks like this:
x                             y        x_num  y_num
10: Already Delinquent 90+    00:Bad   1      1
11: Credit History <6 Months  01:Ind   2      2
12: Current Balance = 0       02:Good  3      3
13: Balance (2-6)=0           NA       4      NA
20: 1+ x 90+                  00:Bad   5      1
30: 3+ x 60-89                01:Ind   6      2
31: 2 x 60-89                 02:Good  7      3
32: 1 x 60-89                 NA       8      NA
40: 3+ x 30-59                00:Bad   9      1
41: 2 x 30-59                 01:Ind   10     2
42: 1 x 30-59                 02:Good  11     3
50: Insufficient Performance  NA       12     NA
60: 3+ x 1-29                 00:Bad   13     1
61: 2 x 1-29                  01:Ind   14     2
62: 1 x 1-29                  02:Good  15     3
70: Never delinquent          NA       16     NA
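One way the numbers in that table could be produced is from the factor's integer codes (the position of each value's level) rather than by parsing the labels. The sketch below is only an illustration under two assumptions: the levels are in their default sorted order, and the literal text "NA" should be treated as missing. `small_df` is a hypothetical stand-in holding the first four rows of the data.

```r
# Hypothetical stand-in for the data above (first four rows only)
small_df <- data.frame(
  x = c("10: Already Delinquent 90+", "11: Credit History <6 Months",
        "12: Current Balance = 0", "13: Balance (2-6)=0"),
  y = c("00:Bad", "01:Ind", "02:Good", "NA"),
  stringsAsFactors = TRUE
)

# Names of all factor columns
factorvars <- names(small_df)[sapply(small_df, is.factor)]

for (v in factorvars) {
  f <- small_df[[v]]
  codes <- as.integer(f)                           # integer code of each level
  codes[as.character(f) %in% "NA"] <- NA_integer_  # assumption: text "NA" means missing
  small_df[[paste0(v, "_Num")]] <- codes           # original factor column is kept
}

print(small_df$x_Num)  # 1 2 3 4
print(small_df$y_Num)  # 1 2 3 NA
```

Note that the codes depend on level order, so a factor built with a custom `levels=` argument would give different numbers.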
"01:Ind" and expect it to convert to 1 – Rich Scriven