0
votes

I have a text file that appears to be tab-delimited, but some of the lines have two tabs between columns. When I read it into R, everything looks great until it hits one of these lines, and then the parsing breaks down.

My guess is that I need something to say that if one tab follows another tab the second one should be ignored.

I've tried using read.table with and without sep="\t", as well as read_table.

data <- read.table("frog.txt",sep="\t", skip = 9, header=TRUE)

What I should get out of this is:

|Ind  |Ad    |Brand  |Net  |Date  |Program  |Genre  |Metric|
|167  |Widg  |Beta   |UPN  |1/1   |Bob      |Anim   |100   |
|168  |Widg  |Gamma  |TNN  |2/2   |Burger   |Anim   | 50   |
|169  |Cog   |Beef   |TLA  |3/3   |Cheers   |Com    |199   |

But what I'm getting is

|Ind  |Ad    |Brand  |Net  |Date  |Program  |Genre  |Metric|
|167  |Widg  |Beta   |UPN  |1/1   |Bob      |Anim   |100   |
|168  |Widg  |Gamma  |TNN  |2/2   |Burger Anim 50          |
|Cog Beef TLA 3/3 Cheers Com 199                           |
2
Can you share the .txt document? The solution is probably to use readLines() to correct the tabs, or to handle a line differently when there are two, as you suggested. statistical-programming.com/r-readlines-example - demarsylvain
Sadly, it's super-duper sensitive (and enormous). I had tried readLines, too; I forgot about that one since it really didn't get me far at all. I'll take a look at the link, though. - ajbentley

2 Answers

1
votes

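One quick solution is to collapse repeated tabs into single tabs before parsing. Note that this assumes no field is legitimately empty, since an empty field also appears as two adjacent tabs:

library(data.table)
# Read the raw lines so the delimiters can be repaired before parsing
data <- readLines("frog.txt")
# Collapse any run of consecutive tabs (two or more) into a single tab
data <- gsub("\t+", "\t", data)
data <- fread(text=data, sep="\t", skip = 9, header=TRUE)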
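For anyone who wants to try the idea without the original file, here is a minimal base-R sketch. The sample lines are made up for illustration (they only mimic the first three columns from the question), and the pattern `\t+` is my assumption about what "repeated tabs" should cover; it collapses any run of tabs, not just pairs.

```r
# Hypothetical sample mimicking the file: the last row has a doubled tab.
lines <- c(
  "Ind\tAd\tBrand",
  "167\tWidg\tBeta",
  "168\tWidg\t\tGamma"
)

# Collapse every run of consecutive tabs into a single tab.
cleaned <- gsub("\t+", "\t", lines)

# Parse the cleaned lines as tab-delimited text.
df <- read.table(text = cleaned, sep = "\t", header = TRUE,
                 stringsAsFactors = FALSE)
print(df)
```

With the real file, `lines` would come from readLines("frog.txt") and the skip = 9 from the question would still be needed when parsing.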
0
votes

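As long as no field contains whitespace, simply omitting sep should be sufficient: the default, sep = "", treats any run of whitespace (spaces or tabs) as a single delimiter. If that still fails, I think some other error is involved. For example: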

read.table(text = "1\t\t2\t3")
##   V1 V2 V3
## 1  1  2  3
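If it is unclear which lines are causing trouble, one way to check is count.fields() from base R, which reports how many fields each line has under a given separator. The sketch below uses made-up sample lines (not the asker's data) and compares a strict tab separator with the default whitespace separator.

```r
# Hypothetical sample: the third line contains a doubled tab.
lines <- c("Ind\tAd\tBrand", "167\tWidg\tBeta", "168\tWidg\t\tGamma")

# Fields per line when each single tab is a separator.
con <- textConnection(lines)
n_tab <- count.fields(con, sep = "\t")
close(con)

# Fields per line when any run of whitespace is one separator (the default).
con <- textConnection(lines)
n_ws <- count.fields(con)
close(con)

# Lines whose tab-based field count differs from the header's.
bad <- which(n_tab != n_tab[1])
print(bad)
```

On the real file, count.fields("frog.txt", sep = "\t", skip = 9) should flag the same kind of lines; if the whitespace-based counts are not all equal either, some field probably contains a space, and omitting sep will not be enough.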