Here is a small, reproducible example. We use the built-in iris data set and save it three times to the working directory as 'iris1.csv', 'iris2.csv', and 'iris3.csv'. We also save the relative paths to those files (here just the file names 'iris1.csv', 'iris2.csv', and 'iris3.csv') to a text file called 'all_my_files.txt'. We can then read the file paths back in from 'all_my_files.txt' and read the data associated with each path.
data.table + loop solution:
library(data.table)
library(tidyverse)
#make filenames
filenames <- paste0("iris", 1:3, ".csv")
#save iris dataset three times, naming them 'iris1.csv', 'iris2.csv' etc.
#(file name passed by position, since newer readr versions deprecate `path` in favour of `file`)
walk(filenames, ~write_csv(iris, .x))
#save the filepath
writeLines(filenames, "all_my_files.txt")
#read all the filepaths back in from text file
get_filenames_from_file <- readLines("all_my_files.txt")
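#pre-allocate one slot per file
files <- vector("list", length(get_filenames_from_file))
mean_v1 <- numeric(length(get_filenames_from_file))
for (i in seq_along(get_filenames_from_file)) {
  #read file i as a data.table
  dat <- fread(get_filenames_from_file[[i]])
  files[[i]] <- dat
  #get mean of a column
  mean_v1[i] <- mean(dat$Sepal.Length)
}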
Full tidyverse solution:
library(tidyverse)
#make filenames
filenames <- paste0("iris", 1:3, ".csv")
#save iris dataset three times, naming them 'iris1.csv', 'iris2.csv' etc.
#(file name passed by position, since newer readr versions deprecate `path` in favour of `file`)
walk(filenames, ~write_csv(iris, .x))
#save the filepath
writeLines(filenames, "all_my_files.txt")
#read all the filepaths back in from text file
get_filenames_from_file <- readLines("all_my_files.txt")
#read the data in from the filepaths
data <- map(get_filenames_from_file, read_csv)
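In either case we now have a list of 3 iris data frames. It is called files (a list of data.tables) in the loop solution and data (a list of tibbles) in the tidyverse solution. The tidyverse result looks like this: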
str(data)
List of 3
$ : tibble [150 × 5] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
..$ Sepal.Length: num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
..$ Sepal.Width : num [1:150] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
..$ Petal.Length: num [1:150] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
..$ Petal.Width : num [1:150] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
..$ Species : chr [1:150] "setosa" "setosa" "setosa" "setosa" ...
..- attr(*, "spec")=
.. .. cols(
.. .. Sepal.Length = col_double(),
.. .. Sepal.Width = col_double(),
.. .. Petal.Length = col_double(),
.. .. Petal.Width = col_double(),
.. .. Species = col_character()
.. .. )
$ : tibble [150 × 5] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
..$ Sepal.Length: num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
..$ Sepal.Width : num [1:150] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
..$ Petal.Length: num [1:150] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
..$ Petal.Width : num [1:150] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
..$ Species : chr [1:150] "setosa" "setosa" "setosa" "setosa" ...
..- attr(*, "spec")=
.. .. cols(
.. .. Sepal.Length = col_double(),
.. .. Sepal.Width = col_double(),
.. .. Petal.Length = col_double(),
.. .. Petal.Width = col_double(),
.. .. Species = col_character()
.. .. )
$ : tibble [150 × 5] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
..$ Sepal.Length: num [1:150] 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
..$ Sepal.Width : num [1:150] 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
..$ Petal.Length: num [1:150] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
..$ Petal.Width : num [1:150] 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
..$ Species : chr [1:150] "setosa" "setosa" "setosa" "setosa" ...
..- attr(*, "spec")=
.. .. cols(
.. .. Sepal.Length = col_double(),
.. .. Sepal.Width = col_double(),
.. .. Petal.Length = col_double(),
.. .. Petal.Width = col_double(),
.. .. Species = col_character()
.. .. )
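The loop solution also records the mean of Sepal.Length for each file, while the tidyverse solution stops at the list of tibbles. A minimal sketch of the same per-file summary with purrr, assuming the same file names as above, might look like this. Naming the list with set_names() is an addition for illustration and is not part of the original answer.

```r
library(purrr)
library(readr)

# Recreate the three example files so the sketch runs on its own
filenames <- paste0("iris", 1:3, ".csv")
walk(filenames, ~ write_csv(iris, .x))

# Read each file; set_names() names every list element after its file
data <- map(set_names(filenames), read_csv)

# One mean per file, returned as a named numeric vector
mean_v1 <- map_dbl(data, ~ mean(.x$Sepal.Length))
mean_v1
```

Because the list is named, each mean can be traced back to the file it came from, which the index-based loop does not give you directly.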
read.csv returns a data frame. If you don't have multiple columns in all_my_files.txt, you might want readLines to return a character vector, which would make the rest of your syntax correct. However, you'll also probably want to use i in your outputs as well, so you don't just overwrite x each iteration of the loop. (And you probably need quotes around "all_my_files.txt".) - Gregor Thomas
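To illustrate the point made in the comment, here is a small base R sketch of the difference between the two readers on a one-column file of paths. The file name paths_demo.txt and the use of tempdir() are assumptions made for this illustration only.

```r
# Write a one-column text file of paths (hypothetical file name for this sketch)
paths_file <- file.path(tempdir(), "paths_demo.txt")
writeLines(c("iris1.csv", "iris2.csv", "iris3.csv"), paths_file)

# read.csv() returns a data frame and, by default, treats the first line as a header
as_df <- read.csv(paths_file)
class(as_df)    # "data.frame"
nrow(as_df)     # 2, because "iris1.csv" was used as the column name

# readLines() returns a character vector with one element per line
as_chr <- readLines(paths_file)
class(as_chr)   # "character"
length(as_chr)  # 3

# Indexing by i keeps one result per path instead of overwriting a single object
seen <- character(length(as_chr))
for (i in seq_along(as_chr)) {
  seen[i] <- as_chr[[i]]
}
```

With the character vector, each path can be passed straight to a reader such as fread() or read_csv() inside the loop.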