How to read multiple csv files with Reduce in R

Question

I'm trying to extract one column from multiple .csv files using Reduce. What I have is

a vector with the path to every single .csv

filepaths

a function to read a .csv and return one of its columns

getData <- function(path,column) {
   d = read.csv(path)
   d[,column]
}

and the Reduce function, to apply the getData function to every single filepath and store the results in a single collection (for demonstration I only take the first three path strings)

Reduce(function(path,acc) append(acc, getData(path,column)), filepaths[1:3],c())

If I do this, I get the following error, which occurs when read.csv is called with one of the filepaths

Error in read.table(file = file, header = header, sep = sep, quote = quote, : 'file' must be a character string or connection

This is strange, because if I call the "getData" function manually like

getData(filepaths[1],col)
getData(filepaths[2],col)
getData(filepaths[3],col)

it works.

I know I could do this with a for loop, but I want to understand what the problem is.

try do.call(rbind,lapply(filepaths, fread, select="colname")) — mtoto
You can also do this with your own function: unlist(lapply(filepaths, function(x){ getData(x,1) })) will read the first column. — fishtank
Why read the whole .csv and then extract only one column? inefficient. fread for example has a select argument... — MichaelChirico

mtoto · Accepted Answer · 2016-02-18T19:33:37

You could use fread from data.table to read in only the desired column, instead of reading in each entire CSV file and then dropping all columns but one, as your function does.

library(data.table)
unlist(lapply(filepaths, fread, select= "colname")) #output is a vector
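
As for why the Reduce call in the question fails: Reduce passes the accumulated value as the first argument and the current element as the second. In function(path,acc), the name path therefore receives the initial value c() (which is NULL), and read.csv(NULL) raises the "'file' must be a character string or connection" error. Swapping the argument order should fix it. Below is a minimal sketch, assuming three small demo files written to a temporary directory and a column named "value" (both hypothetical, only there to make the example runnable):

```r
# Hypothetical demo data: three small CSV files in a temporary directory.
tmp <- tempfile("csvdemo")
dir.create(tmp)
filepaths <- file.path(tmp, paste0("file", 1:3, ".csv"))
for (i in seq_along(filepaths)) {
  write.csv(data.frame(id = 1:2, value = c(i * 10, i * 10 + 1)),
            filepaths[i], row.names = FALSE)
}

# Same helper as in the question: read a file and return one column.
getData <- function(path, column) {
  d <- read.csv(path)
  d[, column]
}

column <- "value"  # assumed column name for the demo files

# Reduce calls f(accumulator, element), so the accumulator must come first.
result <- Reduce(function(acc, path) c(acc, getData(path, column)),
                 filepaths, c())
print(result)
```

With the real filepaths, only the order of acc and path in the anonymous function needs to change; the rest of the original call can stay as it is.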
