I have a large data set in R that was assembled from many small SPSS files. The SPSS files contain some labelling mistakes that I want to correct in R. The values look fine in SPSS, but once the data is read into R the factor levels are full of noise. Editing the factor levels in each individual SPSS file is not practical. Here is a small sample of the data:
data
    a  b  c                   d                  e
[1] 3  5  1 Very dissatisfied 5                  5
[2] 8  3  10                  Don't Know         1
[3] 7  5  3                   8                  6
[4] 3  5  9                   6                  99
[5] 9  4  8                   10 Very Satisfied  3
[6] 5  NA 99 Don't Know       Very Satisfied     10
levels(data[,1])
[1] "1 Very Dissatisfied" "2" "3" "4"
[5] "5" "6" "7" "8"
[9] "9" "1" "10 Very Satisfied" "99 Don't know"
[13] "1 Very Bad" "99" "2 Satisfied" "10"
The levels contain many mistakes. I want to correct them to something like the following:
x<-factor()
x<-ordered(x,levels=c("1 Very Dissatisfied","2 Satisfied","3 Satisfied","4 Satisfied",
"5 Satisfied","6 Satisfied","7 Satisfied","8 Satisfied","9 Satisfied","10 Very Satisfied",
"99 Dont Know"))
levels(x)
[1] "1 Very Dissatisfied" "2 Satisfied" "3 Satisfied" "4 Satisfied"
[5] "5 Satisfied" "6 Satisfied" "7 Satisfied" "8 Satisfied"
[9] "9 Satisfied" "10 Very Satisfied" "99 Dont Know"
I tried the following code:
for (j in c(1, 2, 5)) {
  data[, j] <- factor(data[, j], levels = c(levels(data[, j]), levels(x)))
  for (i in 2:9) {
    data[grep(i, data[, j]), j] <- paste(i, "Satisfied")
  }
}
This does not work. Where am I going wrong, and what should I do instead?
Also, once this code works, I still need to remove the unused garbage levels that the variables contain. How can I do that?
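For reference, here is a minimal sketch of one possible direction, not a confirmed solution. It assumes that every label worth keeping starts with the score it belongs to (so "1 Very Bad", "1" and "1 Very Dissatisfied" all mean score 1) and that labels without a leading number can become NA. The `messy` vector is made-up sample data built from the levels shown above, and `clean_scale()` is a hypothetical helper, not a function from any package.

```r
# Hypothetical sample standing in for one column of the real data;
# the messy labels are taken from the levels printed above.
messy <- factor(c("3", "8", "7", "3", "9", "5",
                  "1 Very Dissatisfied", "1 Very Bad", "1",
                  "10 Very Satisfied", "10",
                  "99 Don't know", "99", "2 Satisfied"))

# Target labels, named by the score each one represents.
target <- c("1 Very Dissatisfied", paste(2:9, "Satisfied"),
            "10 Very Satisfied", "99 Dont Know")
names(target) <- c(1:10, 99)

# Hypothetical helper: map each observed label to its target label.
clean_scale <- function(f, target) {
  # Keep only the leading digits, e.g. "99 Don't know" -> "99".
  # Labels with no leading number are left unchanged by sub().
  score <- sub("^ *([0-9]+).*$", "\\1", as.character(f))
  # Exact lookup by name, so "9" and "99" are never confused;
  # anything that is not a known score becomes NA (an assumption).
  labels <- unname(target[score])
  # Rebuilding the factor with only the target levels also
  # discards every unused garbage level.
  factor(labels, levels = unname(target), ordered = TRUE)
}

cleaned <- clean_scale(messy, target)
levels(cleaned)
table(cleaned, useNA = "ifany")
```

Under the same assumptions, the helper could be applied to several columns at once with `data[c(1, 2, 5)] <- lapply(data[c(1, 2, 5)], clean_scale, target = target)`. Matching on the whole extracted score, rather than using `grep()` on a single digit, avoids treating "99" as a match for 9. If a factor only needs its unused levels dropped without any relabelling, base R's `droplevels()` does that.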