1
votes

Using the command:

read.csv("someFile.csv", nrows = 100)

I can load the first 100 rows of a given CSV file. I am wondering whether R loads the whole file and then shows only the first 100 rows, or whether it loads only the requested rows into memory. If it is the latter, memory usage would be reduced, which is important when working on a PC. So which is it?

EDIT: The help page ?read.csv, in its "Memory usage" section, says:

Using nrows, even as a mild over-estimate, will help memory usage.

but it doesn't say how.

1
It loads only the requested rows. - Scott Ritchie
Can you point me to any docs on that? Thanks. - Mohamed Ali JAMAOUI
I don't have any documentation, but loading real-world data (both the amount of time it takes to read and the amount of RAM it uses) indicates this to be the case. - Scott Ritchie
?read.csv says: "There is extensive discussion in the 'R Data Import/Export' manual." Sometimes it is as easy as reading the manual... - Joris Meys

1 Answer

3
votes

Look at the source of read.table(), which read.csv() calls: the actual loading is implemented by

data <- scan(file = file, what = what, sep = sep, quote = quote, 
    dec = dec, nmax = nrows, skip = 0, na.strings = na.strings, 
    quiet = TRUE, fill = fill, strip.white = strip.white, 
    blank.lines.skip = blank.lines.skip, multi.line = FALSE, 
    comment.char = comment.char, allowEscapes = allowEscapes, 
    flush = flush, encoding = encoding)

Since nrows is passed to scan() as nmax, scan() stops after reading at most nrows records, so the rest of the file is not read.
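
If you would rather check this empirically than read the source, a rough sketch along the following lines should do. The temporary file, the row count, and the column names are made up for illustration. Note that it only compares the size of the resulting objects and the elapsed read time; it does not measure peak memory during the read.

```r
# Write a throwaway CSV with 100,000 rows (sizes and names are arbitrary).
n   <- 100000
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(id = seq_len(n), x = rnorm(n)), tmp, row.names = FALSE)

# Read only the first 100 rows, then the whole file, timing both.
t_first <- system.time(first100 <- read.csv(tmp, nrows = 100))
t_full  <- system.time(full     <- read.csv(tmp))

nrow(first100)   # 100
nrow(full)       # 100000

# The partial read yields a much smaller object than the full read.
print(object.size(first100), units = "Kb")
print(object.size(full), units = "Kb")

# Elapsed times; the partial read is typically much faster on large files.
t_first["elapsed"]
t_full["elapsed"]

# Clean up the temporary file.
unlink(tmp)
```

On a real file of several hundred megabytes the difference in read time is usually obvious, which matches what the comments above report.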