When loading data, R converts character strings to factors unless told otherwise. We then have to convert the factors to character or numeric, depending on the underlying data. For numeric values, we first convert to a character string with as.character() and then pass the result to as.integer() if the values are integers.
However, after stripping the non-numeric characters from a number with gsub(), R automatically turns the cleaned-up column into character strings.
For example:
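To illustrate why the as.character() step matters, here is a minimal sketch. The factor `f` and its values are made up for the example; the point is only that as.integer() applied directly to a factor returns the internal level codes rather than the labels.

```r
# Hypothetical factor holding integer-like labels (illustrative values only)
f <- factor(c("65000", "102000", "85000", "72000"))

# Applied directly to a factor, as.integer() returns the internal level codes
codes <- as.integer(f)

# Going through as.character() first recovers the labelled values
values <- as.integer(as.character(f))

print(codes)
print(values)
```

The level codes depend on the sort order of the labels, so they generally bear no relation to the numbers the labels spell out.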
> sal <- data.frame(name = c('abc','def','ghi','pqr'),
+ Salary = c('$65,000','$102,000','$85,000','$72,000'))
> str(sal)
'data.frame': 4 obs. of 2 variables:
$ name : Factor w/ 4 levels "abc","def","ghi",..: 1 2 3 4
$ Salary: Factor w/ 4 levels "$102,000","$65,000",..: 2 1 4 3
> sal$Salary <- gsub('\\$','',sal$Salary)
> sal$Salary <- gsub(',','',sal$Salary)
> str(sal)
'data.frame': 4 obs. of 2 variables:
$ name : Factor w/ 4 levels "abc","def","ghi",..: 1 2 3 4
$ Salary: chr "65000" "102000" "85000" "72000"
>
We can see that the 'Salary' column changes from factor to character after gsub(). Could someone tell me whether gsub() also performs an as.character() conversion here? If so, why does it not go on to convert the column to integer, given that all the cleaned values are integers?
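For reference, a minimal sketch of the behaviour in question, using the same data. It assumes the column is created as a factor (stringsAsFactors = TRUE is passed explicitly so the result does not depend on the R version's default) and merges the two gsub() calls into one character class. As far as I can tell from the documentation, gsub() coerces its input with as.character() and always returns a character vector, so the conversion to integer has to be requested explicitly.

```r
# Same data as above; stringsAsFactors is set explicitly for reproducibility
sal <- data.frame(name = c("abc", "def", "ghi", "pqr"),
                  Salary = c("$65,000", "$102,000", "$85,000", "$72,000"),
                  stringsAsFactors = TRUE)

# Strip "$" and "," in one pass; the result is a character vector
cleaned <- gsub("[$,]", "", sal$Salary)
print(class(cleaned))

# gsub() does not guess a numeric type, so convert explicitly
sal$Salary <- as.integer(cleaned)
str(sal)
```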