I have a problem with the read.table function in R. I know this might be a common problem, but a thorough search of this forum and the web in general has not helped me fix it. I have a .txt file consisting of 253 columns and 458800 rows, delimited by tabs. I am trying to read it into R using this code:
>data<-read.table("file.txt, header=TRUE,nrows=100,sep="\t")
>names<-colnames(data)
>classes<-sapply(data[1,],class)
>data<-read.table("file.txt",colClasses=classes,col.names=names,header=TRUE,nrows=460000,sep="\t",fill=TRUE)
However, when I use the sep="\t" argument, R skips about half of the rows, seemingly at random, and loads only about 240000 of them. If I leave out sep="\t", all rows load, but the columns are parsed incorrectly, and I get a warning saying that the number of columns in col.names does not match the number of columns found in the header.
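In case it helps, one diagnostic I can run (a sketch using base R's count.fields, with quote and comment handling turned off) is to count the tab-separated fields on each line; if stray quote or comment characters in the data are merging lines, the counts will be uneven:

# Count fields per line with quote/comment handling disabled.
# Stray " or # characters in the data are a common reason read.table
# silently merges or drops lines when sep="\t" is set.
fc <- count.fields("file.txt", sep = "\t", quote = "", comment.char = "")
table(fc)   # every line should report 253 fields
length(fc)  # should equal the file's full line count (header + 458800 rows)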
I think the problem might be that some fields in the .txt file are blank. These fields are truly empty: no spaces, no NA, nothing. For example:

field1\tfield2\t\tfield4\t (field 3 is empty)
I got the file from a third party and am not in a position to make any changes to it. Can anyone help me fix this problem?
Thanks in advance,
Tim
read.delim? I don't know how that handles empty fields, but you could give it a try. - talat

There's also the fill argument in read.delim; perhaps it works if you specify fill = TRUE. - talat
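A sketch of that suggestion (untested against the file above; note that read.delim already defaults to sep = "\t" and fill = TRUE, and quote = "" is an extra assumption here in case stray quote characters are merging lines):

# read.delim is read.table with tab-friendly defaults (sep = "\t",
# fill = TRUE, comment.char = ""); quote = "" additionally disables
# quote handling, which can otherwise swallow lines.
data <- read.delim("file.txt", header = TRUE, quote = "", fill = TRUE)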