I am experimenting with different packages to find the fastest way to save data files such as CSV.
I have found the 'iotools' package and its function 'write.csv.raw', which performs well in terms of elapsed time.
However, the saved file has some problematic features:
- no column names;
- double/float numbers use "." as the decimal sign instead of ",".
So I need the saved file to have column names and the correct decimal sign.
My script is as follows:
library(iotools)
library(UsingR)
data(galton)
head(galton)
#option1 to save data
write.csv.raw(galton,"test.csv",append=FALSE,sep=";",col.names=TRUE)
#option2 to save data
write.table.raw(galton,"test.csv",append=FALSE,sep=";",col.names=TRUE)
read.csv2("test.csv",nrow=5)
The input dataset (from R):
child parent
61.7 70.5
61.7 68.5
61.7 65.5
61.7 64.5
61.7 64.0
62.2 67.5
The output file:
X1.61.7 X70.5
2\t61.7 68.5
3\t61.7 65.5
4\t61.7 64.5
5\t61.7 64
6\t62.2 67.5
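For reference, the target format I am after (header row, ";" as separator, "," as decimal sign) can be illustrated with base R. This is only a minimal sketch on a tiny made-up data frame `df` whose values are copied from the rows above; it shows the expected file layout, not a fast solution.

```r
# Tiny sample with values taken from the first rows shown in the question.
df <- data.frame(child = c(61.7, 61.7, 62.2), parent = c(70.5, 68.5, 67.5))
out <- tempfile(fileext = ".csv")

# write.csv2 uses ";" as the separator and "," as the decimal sign,
# and writes the column names as the first line.
write.csv2(df, out, row.names = FALSE, quote = FALSE)

# Inspect the raw lines of the file.
lines <- readLines(out)
print(lines)

# Round trip: read.csv2 expects the same conventions.
back <- read.csv2(out)
print(back)
```

The first line of the file holds the column names, and every following line uses the decimal comma, which is what read.csv2 expects.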
Update of 18/02/16:
With the help of the answer by procrastinator0, I have managed to use 'write.csv.raw' correctly.
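One way to get a header when the writer itself does not emit one is to write the header line by hand and then append the data rows. The sketch below is an assumption about the general pattern, not a reproduction of the answer: base R's write.table stands in for the fast writer, and `df` is a made-up sample.

```r
# Made-up sample; values follow the rows shown in the question.
df <- data.frame(child = c(61.7, 61.7, 64.0), parent = c(70.5, 68.5, 64.5))
out <- tempfile(fileext = ".csv")

# Step 1: write the header line manually, joined with the ";" separator.
writeLines(paste(names(df), collapse = ";"), out)

# Step 2: append the data rows without a header.
# write.table is a stand-in here; dec = "," produces the decimal comma.
write.table(df, out, append = TRUE, sep = ";", dec = ",",
            row.names = FALSE, col.names = FALSE, quote = FALSE)

result <- readLines(out)
print(result)
```

With a writer that has no `dec` argument, the numeric columns would have to be converted to text with a decimal comma before writing, which costs extra time.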
A comparison of different write methods, based on the data frame from the question section, is as follows:
system.time(write.csv.raw(n,"test.csv",sep=";",append=TRUE))
user system elapsed
15.61 1.17 21.92
system.time(write.table(n,"test.csv",sep=";",row.names=FALSE,dec=","))
user system elapsed
63.25 1.20 64.60
system.time(write.csv2(n,"test.csv",row.names=FALSE))
user system elapsed
63.71 1.28 65.38
system.time(write_csv(n, "test.csv", na = "NA"))
user system elapsed
136.75 3.60 141.24
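The timings above depend on the machine and on the size of `n`. A minimal sketch of how such a comparison can be reproduced with base R only is given below; the synthetic data frame is a hypothetical stand-in for `n` (the real one is much larger), and only the two base writers are timed so that no extra packages are needed.

```r
set.seed(1)
# Hypothetical stand-in for `n`: 10,000 rows of two numeric columns.
n <- data.frame(child  = round(rnorm(1e4, mean = 68, sd = 2.5), 1),
                parent = round(rnorm(1e4, mean = 68, sd = 1.8), 1))

f1 <- tempfile(fileext = ".csv")
f2 <- tempfile(fileext = ".csv")

# Time each writer on the same data, writing to separate files.
t_table <- system.time(write.table(n, f1, sep = ";", dec = ",", row.names = FALSE))
t_csv2  <- system.time(write.csv2(n, f2, row.names = FALSE))

# Collect the elapsed times (in seconds) for a side-by-side view.
elapsed <- c(write.table = t_table[["elapsed"]], write.csv2 = t_csv2[["elapsed"]])
print(elapsed)

# Both calls should produce the same file content.
same <- identical(readLines(f1), readLines(f2))
print(same)
```

Writing to separate temporary files avoids one run overwriting the other, and comparing the file contents confirms that the timed calls do the same work.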
Update of 27/04/16: (out of date)
I have run some experiments on writing and reading data with different tools. The experiments are based on a theoretical sample as well as a real one (from my practice). I have tried to make the scripts reproducible. I hope they will be useful for newcomers :-)
Links to IO experiments:
Reading data from files: https://rpubs.com/demydd/166375
Writing data to files: https://rpubs.com/demydd/170957
Update of 19/09/16:
The feather package is added (read_feather, write_feather).
fwrite from the data.table package is added.
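As an illustration, fwrite writes the column names by default and accepts both `sep` and `dec`, so it can produce the format asked for in the question directly. This is a minimal sketch that assumes a data.table version whose fwrite supports the `dec` argument; `df` is a made-up sample, and the base R fallback is only there so the snippet runs without data.table installed.

```r
# Made-up sample; values follow the rows shown in the question.
df <- data.frame(child = c(61.7, 62.2), parent = c(70.5, 67.5))
out <- tempfile(fileext = ".csv")

if (requireNamespace("data.table", quietly = TRUE)) {
  # fwrite writes the header by default; sep and dec set the CSV2-style format.
  data.table::fwrite(df, out, sep = ";", dec = ",")
} else {
  # Fallback with base R if data.table is not installed.
  write.table(df, out, sep = ";", dec = ",", row.names = FALSE, quote = FALSE)
}

fw_lines <- readLines(out)
print(fw_lines)
```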
Links to updated tests:
col.names=TRUE. Data is not controversial, what is the question? – zx8754
write.csv.raw faster than write_csv {readr}? Your question is about which is the fastest method to write a .csv file, right? – rafa.pereira