Within a for loop, I am trying to run a function on two columns of a data frame, moving on to a different data set on every iteration of the loop. I would like to collect the result of every iteration into a single vector of answers.
I can't get past the following errors (listed below my code). Which one I get depends on whether I add or remove row.names = NULL in the data <- read.csv(...) call (line 4 of the for loop):
**Edited to include directory references, which is where the error ultimately was:**
corr <- function(directory, threshold = 0) {
# Note: the code above, together with my directory organization (not shown), was where my error was
lookup <- complete("specdata")
files <- list.files(full.names = TRUE) # read file names
len <- length(files)
answer2 <- vector("numeric")
answer <- vector("numeric")
dataN <- data.frame()
for (i in seq_len(len)) {
if (lookup[i,"nobs"] > threshold){
# TRUE -> read that file, remove the NA data and add to the overall data frame
data <- read.csv(file = files[i], header = TRUE, sep = ",")
#remove incomplete
dataN <- data[complete.cases(data),]
#If yes, compute the correlation and assign its results to an intermediate vector.
answer2 <- c(answer2, answer)
} # end if
} # end for
setwd("../")
return(answer2)
}
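For comparison, here is a minimal sketch of the same idea that avoids setwd() entirely by passing the directory to list.files(). It is not the original corr(): the name corr_sketch is hypothetical, it counts complete rows directly instead of calling complete("specdata"), and it assumes the two columns of interest are columns 2 and 3 of each CSV file.

```r
# Hypothetical helper (not the original corr()): correlate columns 2 and 3
# of every CSV file in `directory` that has more than `threshold` complete rows.
corr_sketch <- function(directory, threshold = 0) {
  # full.names = TRUE returns paths that already include `directory`,
  # so the working directory never has to change.
  files <- list.files(directory, pattern = "\\.csv$", full.names = TRUE)
  answers <- numeric(0)
  for (f in files) {
    data <- read.csv(f, header = TRUE)
    complete <- data[complete.cases(data), ]  # drop rows containing NA
    if (nrow(complete) > threshold) {
      # Assumption: columns 2 and 3 hold the values to correlate.
      answers <- c(answers, cor(complete[, 2], complete[, 3]))
    }
  }
  answers
}

# Usage on two small throwaway files in a temporary directory.
demo_dir <- tempfile("demo")
dir.create(demo_dir)
write.csv(data.frame(id = 1:4, x = c(1, 2, 3, NA), y = c(2, 4, 6, 8)),
          file.path(demo_dir, "001.csv"), row.names = FALSE)
write.csv(data.frame(id = 1:3, x = c(1, 2, 3), y = c(3, 2, 1)),
          file.path(demo_dir, "002.csv"), row.names = FALSE)
result <- corr_sketch(demo_dir)
print(result)
```

Because every path is built from the directory argument, the function reads the same files no matter what the working directory is when it is called.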
1) Error in read.table(file = file, header = header, sep = sep, quote = quote, : duplicate 'row.names' are not allowed
2) Error in `[.data.frame`(data, , 2:3) : undefined columns selected
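Both messages can be reproduced in isolation, which may help show what kind of input produces them. This is a minimal sketch with made-up inline data, not the actual files. It assumes the first error comes from a file whose data rows have one more field than its header, and the second from a data frame with fewer than three columns. The helper name error_message is hypothetical.

```r
# Hypothetical helper: evaluate an expression and return its error message,
# or NA if it runs without error.
error_message <- function(expr) {
  tryCatch({
    expr
    NA_character_
  }, error = function(e) conditionMessage(e))
}

# The header has 2 fields but each data row has 3, so read.csv() uses the
# first column as row names; the repeated value 1 is then rejected.
msg1 <- error_message(read.csv(text = "a,b\n1,2,3\n1,4,5"))

# Asking for columns 2:3 of a data frame that has only one column.
one_col <- data.frame(a = 1:3)
msg2 <- error_message(one_col[, 2:3])

print(msg1)
print(msg2)
```

If the loop picks up something other than the intended CSV files (for example because the working directory is not what the code expects), read.csv() can return a data frame with an unexpected shape, and either message may follow.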
What I've tried
- referring to the column names directly "colA"
- initializing data and dataN to empty data.frames before the for loop
- initializing answer2 to an empty vector
- Getting a better understanding of how vectors, matrices and data.frames work with each other
**Thank you!**