0
votes

I have a problem with the read.table function in R. I know this might be a common problem but a thorough search of this forum and the web in general has not helped me fix it. I have a .txt file consisting of 253 columns and 458800 rows deliminated by tabs. I am trying to read it into R using this code:

> data <- read.table("file.txt", header=TRUE, nrows=100, sep="\t")

> names <- colnames(data)

> classes <- sapply(data[1,], class)

> data <- read.table("file.txt", colClasses=classes, col.names=names, header=TRUE, nrows=460000, sep="\t", fill=TRUE)

However, when I use the sep="\t" argument, R randomly skips about half of the rows, loading only around 240000 of them. If I leave out sep="\t", it loads all the rows, but the columns are incorrect. It also gives a warning that the number of names in col.names doesn't match the number of columns.

I think the problem might be that some fields in the .txt file are blank. These fields are truly empty, with no spaces, NA, or anything, for example:

field1"\t"field2"\t""\t"field4"\t" (field 3 is empty)
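For what it's worth, a tiny test of my own with an empty field (using the text= argument instead of a file, which I assume behaves the same way) seems to parse fine, so I'm not sure the blanks alone explain it:

```r
# Minimal check: a line with an empty third field between two tabs
txt <- "field1\tfield2\t\tfield4"
read.table(text = txt, sep = "\t")
# the empty field should come back as a missing value (NA)
```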

I got the file from a third party and am not in a position to make any changes to it. Can anyone help me fix this problem?

Thanks in advance,

Tim

1
Did you try using read.delim? I don't know how that handles empty fields but you could give it a try. - talat
Also have a look at the fill argument in read.delim, perhaps it works if you specify fill = TRUE. - talat
Unfortunately, I already tried the fill argument; it changed the columns, but they were still wrong. - Tim.R
How about quote = "\"" in read.table? Does that have any effect? - r.bot

1 Answer

0
votes

Have you tried the package data.table? It has a function fread that detects the separator automatically. You can try it like this:

library(data.table)
data <- fread("file.txt")
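If fread isn't an option, row-skipping like you describe is often caused by stray quote or comment characters in the data rather than by the empty fields. A sketch of what I'd try with plain read.table, assuming the file really is just tab-separated text:

```r
# Disable quote and comment handling, which often fixes
# silently merged or skipped rows in read.table
data <- read.table("file.txt", header = TRUE, sep = "\t",
                   quote = "", comment.char = "", fill = TRUE)
```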

Let me know if this helps.

Thanks