I'm using fread to read a 179 MB CSV file with 16 columns and 637501 rows. fread is not reading the first 29 lines of the CSV file, and it misses the headers in the first line as well. I have tried:
fread("filename.csv",sep= ",")
fread("filename.csv",sep= "," , skip=0L)
fread("filename.csv",sep= "," , skip=1L)
fread("filename.csv",sep= ",", autostart=1L)
When I set header=TRUE, row 30 is used as the header, but fread still fails to recognize the first 29 rows. I am able to read the same file with read.csv without any issues (it just takes a lot longer).
Is this a bug or am I missing something?
Link to a sample CSV that produces the same bug (20kb) https://dl.dropboxusercontent.com/u/17747104/example.csv
Here's the link to the 179mb file. https://dl.dropboxusercontent.com/u/17747104/read.csv
read.table has a feature that will automatically add blank fields if you provide a malformed CSV file with a different number of fields per row. I'm not sure how to handle this with fread short of modifying the file itself. – joran
With data.table 1.8.11+, I'd do fread("awk 'BEGIN{OFS = FS = \",\"}{$36 = $36; print}' yourfile.csv") (replace 36 with whatever the right number of columns is) – eddi
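To make the awk trick in the comment above a bit more concrete, here is a minimal sketch run on a tiny made-up file. The sample rows and the column count of 4 are assumptions for illustration only, not taken from the linked CSVs. It first counts how many rows have each number of fields, then pads short rows with empty trailing fields: assigning to the last field makes awk rebuild each record using OFS. Note that this simple approach splits on every comma, so it would likely mishandle quoted fields that contain commas, and it does not truncate rows that have too many fields.

```shell
# Create a small ragged CSV in a temporary file (hypothetical demo data):
# the header has 4 fields, but two of the rows have fewer.
sample=$(mktemp)
printf 'id,name,city,score\n1,ann\n2,bob,paris\n3,cy,rome,9\n' > "$sample"

# Diagnose: count how many rows have each number of comma-separated fields.
awk -F',' '{ print NF }' "$sample" | sort -n | uniq -c

# Pad every row to 4 fields. Assigning $4 to itself forces awk to rebuild
# the record with OFS, creating empty fields where they were missing.
padded=$(awk 'BEGIN { OFS = FS = "," } { $4 = $4; print }' "$sample")
printf '%s\n' "$padded"

rm -f "$sample"
```

With the real file, the same awk program could be passed to fread as a shell command, as in the comment, with 4 replaced by the actual number of columns.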