4
votes

I have a .csv file in which an ID column contains long integers with leading zeros. fread converts that column to integer64. How can I specify the class for that one column and let fread automatically guess the classes of the remaining columns? I'm not sure whether this is an "all-or-nothing" situation.

I have 50+ columns and would rather not have to specify the data types for all of them just because I have to do so for one of them.

My question is related to: R fread - read all columns as character.

2
What type do you want the IDs column to be? - steveb
Whatever type preserves the fidelity of a number like "001001001150000285723". So I guess character? - user2205916

2 Answers

8
votes

From ?fread:

# colClasses
data = "A,B,C,D\n1,3,5,7\n2,4,6,8\n"
fread(data, colClasses=c(B="character",C="character",D="character"))  # as read.csv
fread(data, colClasses=list(character=c("B","C","D")))    # saves typing
fread(data, colClasses=list(character=2:4))     # same using column numbers

That is, if your zero-padded column is called big_num, just use colClasses = list(character = 'big_num'). Columns you don't name in colClasses are still auto-detected, so it is not all-or-nothing.
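As a quick check, here is a minimal sketch of the partial override on made-up data. The column names count and label and the sample values are assumptions for illustration only; big_num is the name used above, and the input is passed through fread's text argument so nothing needs to exist on disk.

```r
library(data.table)

# Hypothetical sample data: one zero-padded ID column plus two ordinary columns.
csv_text <- "big_num,count,label\n001001001150000285723,10,a\n000000000000000000042,20,b\n"

# Override only big_num; count and label are left for fread to auto-detect.
dt <- fread(text = csv_text, colClasses = list(character = "big_num"))

# Inspect the resulting column classes and confirm the leading zeros survived.
print(sapply(dt, class))
print(dt$big_num)
```

If big_num comes back as character with its zeros intact while count is still numeric, the override applied to that one column only.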

3
votes

To auto-detect the column types first and then override specific columns:

# fName is the path to your .csv file
# Auto-detect the column types (nrows = 0 returns the header with typed, empty columns)
colCls <- sapply(fread(fName, nrows = 0), class)
# Override the "wrong" detected column types
colCls[c("field1", "field2")] <- "character"
dt <- fread(fName, colClasses = colCls)