I have a data frame with about 15 variables, and I need to remove outliers from each of them.
Following a tutorial on the web, I am using the boxplot method to find the outliers. I remove them in a stacked way, one variable at a time, until every variable in the data frame has been treated.
Here is my code. My question is: is this a good way to remove outliers, and how can I improve the code?
# Removing outliers from the columns
# Step 1: drop rows whose var1 value is flagged by the boxplot rule
outliers <- boxplot(outlier_H_rem$var1, plot = FALSE)$out
# Logical indexing also works when `outliers` is empty, so no if/else is needed
outlier_H_rem1 <- outlier_H_rem[!(outlier_H_rem$var1 %in% outliers), ]
boxplot(outlier_H_rem1$var1)
# Step 2: repeat on the already-filtered data frame for var2
outliers <- boxplot(outlier_H_rem1$var2, plot = FALSE)$out
outlier_H_rem2 <- outlier_H_rem1[!(outlier_H_rem1$var2 %in% outliers), ]
boxplot(outlier_H_rem2$var2)
outlier_H_rem is the starting data frame. Each step tests the next variable on the result of the previous step (outlier_H_rem1$var1, outlier_H_rem2$var2, outlier_H_rem3$var3, and so on) until the last variable. outlier_H_rem15 is the final stacked data frame, in which all variables have been treated.
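For reference, the stacked procedure described above could probably be written as a single loop instead of 15 copies of the same block. This is only a rough sketch: the helper name remove_boxplot_outliers and the toy data are made up for illustration, and it assumes the columns are numeric. Because each step runs on the already-filtered rows, the order of the columns can change which rows end up being dropped.

```r
# Hypothetical helper (not from any package): drop rows flagged by the
# boxplot rule, one column at a time, reusing the filtered data each step.
remove_boxplot_outliers <- function(df, cols) {
  for (col in cols) {
    # boxplot.stats() computes the $out values that boxplot() reports
    out <- boxplot.stats(df[[col]])$out
    # Logical indexing keeps every row when `out` is empty
    df <- df[!(df[[col]] %in% out), , drop = FALSE]
  }
  df
}

# Toy data standing in for outlier_H_rem (values are invented)
outlier_H_rem <- data.frame(
  var1 = c(1, 2, 3, 4, 5, 6, 7, 100),
  var2 = c(10, 11, 12, -50, 13, 14, 15, 16)
)

outlier_H_rem_clean <- remove_boxplot_outliers(outlier_H_rem, c("var1", "var2"))
```

With the real data, the second argument would be the vector of the 15 column names, for example names(outlier_H_rem) if every column should be treated.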
