Import of data set using read.table where data is uneven

Question

I took the mtcars data set from MASS and made a few modifications. After moving it from one package to another, I finally got the data into Notepad, where I carefully made sure that spaces were the delimiter. My problem is that the resulting file will not read in, although a test file reads in just fine.

Can you explain what the error message is telling me? Thank you. The error messages and the code I used are given below.

TEST.txt
120 140 7.5
140 150 8.5

mtcars2=read.table(file="TEST.txt",header=FALSE)
mtcars2
#    V1  V2  V3
# 1 120 140 7.5
# 2 140 150 8.5  OK

mtcars2.txt problem dataset

160 110 2.62
160 110 2.875
160 110 2.32
160 110 3.215
160 115 3.44

mtcars2=read.table("c:\\data\\mtcars2.txt",header=FALSE)

Warning messages:
1: In read.table("c:\data\mtcars2.txt", header = FALSE) :
  line 1 appears to contain embedded nulls
...
6: In scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
  embedded nul(s) found in input
NOT OK

I also tried the following:

mtcars2=read.table("c:\\data\\mtcars2.txt",fill=T,header=FALSE)

I cannot reproduce your problem. I do get a warning (incomplete final line), but the file still reads just fine. I copied your mtcars2.txt into Notepad and saved it, then used your read.table call above, and it more or less worked for me. I would check whether any lines in your .txt file have trailing whitespace. — Bryan Goggin
Copying and pasting your data works fine for me, though you may have weird Unicode invisible characters in your data. Also, assuming you meant that dataset worked fine, you should move the "OK", as that would create a problem for read.table. — alistaire
Unicode text was used along the way as I converted from one package to another. What worked for me was to copy my data set, paste it into the question box on Stack Overflow, then copy the pasted data and put it into a new Notepad file. — Mary A. Marion

Marichyasana · Accepted Answer · 2016-07-02T18:46:12

I used cut/paste and it works fine. I then put a null just before the first space and I got:

line 1 appears to contain embedded nulls.
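
The experiment above can be sketched in R itself. This is only an illustration, not the exact steps used for this answer: the temporary file, the two sample rows, and the variable names are all assumptions. It writes a file with a single zero byte just before the first space, collects whatever warnings read.table raises, and then retries with the skipNul argument that recent versions of read.table provide.

```r
# Hypothetical temporary file: row 1 has a NUL byte just before the first space.
path <- tempfile(fileext = ".txt")
bytes <- c(charToRaw("160"), as.raw(0), charToRaw(" 110 2.62\n"),
           charToRaw("160 110 2.875\n"))
writeBin(bytes, path)

# Collect warnings instead of printing them; if read.table also stops with
# an error, keep the error message rather than aborting the script.
warnings_seen <- character(0)
result <- tryCatch(
  withCallingHandlers(
    read.table(path, header = FALSE),
    warning = function(w) {
      warnings_seen <<- c(warnings_seen, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  ),
  error = function(e) conditionMessage(e)
)
print(warnings_seen)

# Asking read.table to skip NUL bytes lets the same file parse as plain numbers.
clean <- read.table(path, header = FALSE, skipNul = TRUE)
print(clean)
```

Note that skipNul only drops the zero bytes. That is reasonable for a stray null like this one, but it is a workaround rather than a fix if the file is really in a 16-bit encoding.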

I suspect you modified the file in a way that left zero bytes in it: either the values were written out like C strings with terminating zeros, or the file was saved as 16-bit Unicode (UTF-16), which also gives trouble because every ASCII character is stored with an accompanying zero byte.
One way to check the contents of every byte in the file is the UNIX/Linux od program:

od -c filename

sample output:

0000000000     1   6  \0  \0       1   1  \0       2   .   6   2  \r  \n   1
0000000020     6  \0       1   1  \0       2   .   8   7   5  \r  \n   1   6
0000000040    \0       1   1  \0       2   .   3   2  \r  \n   1   6  \0
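
If od is not available (for example on Windows), a similar check can be done from R. The sketch below rests on an assumption that may not match the original file: that it was saved as UTF-16LE without a byte order mark. The temporary file and variable names are made up for the illustration. It writes two sample rows in that encoding, looks at the first raw bytes with readBin, and then reads the file by declaring the encoding through the fileEncoding argument of read.table.

```r
# Hypothetical stand-in for a file saved as 16-bit Unicode (UTF-16LE, no BOM).
path <- tempfile(fileext = ".txt")
txt <- "160 110 2.62\n160 110 2.875\n"
utf16 <- iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
writeBin(utf16, path)

# Inspect the first bytes, much as od would show them:
# each ASCII character is followed by a 00 byte.
head_bytes <- readBin(path, what = "raw", n = 16)
print(head_bytes)
has_nul <- any(head_bytes == as.raw(0))

# Declaring the encoding lets read.table convert the text before parsing it.
dat <- read.table(path, header = FALSE, fileEncoding = "UTF-16LE")
print(dat)
```

If the bytes of the real file show a different pattern, the fileEncoding value would have to be changed to match, or the file re-saved from the editor as plain ANSI/ASCII text.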
