
I'm trying to read a .csv file in which every cell is quoted, like this:
"a","b"
"1","hello"
"2","hello, test"

With read.csv() this works fine and column "a" is of type integer. With data.table::fread(), however, column "a" is of type character.

library(data.table)
x <- fread("\"a\",\"b\"\n\"1\",\"hello\"\n\"2\",\"hello, test\"")
summary(x)

    a                  b            
Length:2           Length:2          
Class :character   Class :character  
Mode  :character   Mode  :character 

Is there a way to tell fread to determine the column types in fully quoted .csv files?
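
For comparison, the read.csv() behaviour described above can be reproduced on the same text. This is only a minimal sketch; the variable names `csv_text` and `df` are made up for illustration.

```r
# Same fully quoted input as in the fread() call above
csv_text <- "\"a\",\"b\"\n\"1\",\"hello\"\n\"2\",\"hello, test\""

# read.csv() strips the quotes first and then guesses the column types
df <- read.csv(text = csv_text)

# Inspect the class that was inferred for column "a"
class(df$a)
```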

You could specify the column classes with the colClasses parameter. See ?fread. - Jaap
To specify the types via the colClasses parameter I would need to know them in advance, but I don't. The small example above only illustrates what the format looks like. My actual input file has hundreds of columns. - Steffen J.
You can use x[, names(x) := lapply(.SD, type.convert)] to convert after the fact. Maybe that step should be added as an option to fread... - Frank
Indeed, this works with as.is = TRUE. Thanks! - Steffen J.
Will look into it: #1487 - Matt Dowle

1 Answer

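library(data.table)
x <- fread("\"a\",\"b\"\n\"1\",\"hello\"\n\"2\",\"hello, test\"")
# Re-detect the type of every column by reference; as.is = TRUE keeps strings as character
x[, names(x) := lapply(.SD, type.convert, as.is = TRUE)]
summary(x)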

    a             b            
Min.   :1.00   Length:2          
1st Qu.:1.25   Class :character  
Median :1.50   Mode  :character  
Mean   :1.50                     
3rd Qu.:1.75                     
Max.   :2.00
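
With hundreds of columns, it may be worth limiting the conversion to the columns that fread() actually left as character. The following is a rough sketch of that variation, not part of the original answer; the helper variable `chr_cols` is a name chosen here for illustration.

```r
library(data.table)

# Same fully quoted input as in the question
x <- fread("\"a\",\"b\"\n\"1\",\"hello\"\n\"2\",\"hello, test\"")

# Only columns that are currently character need to be re-inspected
chr_cols <- names(x)[vapply(x, is.character, logical(1))]

# Convert those columns by reference; as.is = TRUE keeps real strings as character
if (length(chr_cols) > 0) {
  x[, (chr_cols) := lapply(.SD, type.convert, as.is = TRUE), .SDcols = chr_cols]
}

# Check the resulting column classes
sapply(x, class)
```

Columns that are already numeric are skipped, so the extra pass only touches the columns where type detection could change anything.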