I am trying to compute the correlation between all numeric columns of a dataset (it contains both numeric and non-numeric columns) using the following code:
mydata= read.csv("C:\\full_path\\playerData.csv", header = TRUE)
mydata=data.frame(mydata)
vals=cor(mydata, use="complete.obs", method="pearson")
write.csv(vals,"C:\\Users\\weiler\\Desktop\\RStudioOutput.csv")
The code is based on this site: http://www.statmethods.net/stats/correlations.html
I am getting the following error:
Error in cor(mydata, use = "complete.obs", method = "pearson") : 'x' must be numeric
My error seems to be because some of the data is non-numeric. Is there a simple way to ignore the non-numeric data?
mydata[sapply(mydata, is.numeric)]
– David Arenburg
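For completeness, here is a minimal sketch of how the suggestion in the comment above could be plugged into the original call. The small data frame and its column names (player, goals, shots, team) are made up for illustration and only stand in for playerData.csv; the variable name numeric_data is likewise an assumption.

```r
# Hypothetical stand-in for playerData.csv: numeric and non-numeric columns mixed.
mydata <- data.frame(
  player = c("a", "b", "c", "d"),
  goals  = c(1, 2, 3, 4),
  shots  = c(2, 4, 6, 9),
  team   = c("x", "y", "x", "y"),
  stringsAsFactors = FALSE
)

# sapply() returns one TRUE/FALSE per column; indexing with it keeps
# only the columns for which is.numeric() is TRUE.
numeric_data <- mydata[sapply(mydata, is.numeric)]

# cor() now only sees numeric columns, so "'x' must be numeric" should not occur.
vals <- cor(numeric_data, use = "complete.obs", method = "pearson")
print(vals)
```

Note that this only drops columns R has stored as non-numeric. If a column that should be numeric was read in as character or factor (for example because of stray text in the CSV), it would be dropped as well and would need to be converted first.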