reading a csv file with repeated row names in R

24

votes

I am trying to read a csv file with repeated row names but could not. The error message I am getting is Error in read.table(file = file, header = header, sep = sep, quote = quote, : duplicate 'row.names' are not allowed.

The code I am using is:

S1N657 <- read.csv("S1N657.csv",header=T,fill=T,col.names=c("dam","anim","temp"))

An example of my data is given below:

did <- c("1N657","1N657","1N657","1N657","1N657","1N657","1N657","1N657","1N657","1N657")
aid <- c(101,102,103,104,105,106,107,108,109,110)
temp <- c(36,38,37,39,35,37,36,34,39,38)

data <- cbind(did,aid,temp)

Any help will be appreciated.

r rownames

Does this answer your question? duplicate 'row.names' are not allowed error - Brian D

33

votes

the function is seeing duplicate row names, so you need to deal with that. Probably the easiest way is with row.names=NULL, which will force row numbering--in other words, it treats your first column as the first dimension and not as the row numbers, and so adds row numbers (consecutive integers starting with "1".

read.csv("S1N657.csv", header=T,fill=T, col.names=c("dam","anim","temp"), row.names=NULL)

3

votes

try this:

S1N657 <- read.csv("S1N657.csv",header=T,fill=T,col.names=c("dam","anim","temp"), 
          row.names = NULL)[,-1]

2

votes

Guessing your csv file was one converted from xlsx.Add a comma to the end of the first row ,remove the last row ,done

2

votes

An issue I had recently was that the number of columns in the header row did not match the number of columns I had in the data itself. For example, my data was tab-delimited and all of the data rows had a trailing tab character. The header row (which I had manually added) did not.

I wanted the rows to be auto-numbered, but instead it was looking at my first row as the row name. From the docs (emphasis added by me):

row.names a vector of row names. This can be a vector giving the actual row names, or a single number giving the column of the table which contains the row names, or character string giving the name of the table column containing the row names.

If there is a header and the first row contains one fewer field than the number of columns, the first column in the input is used for the row names. Otherwise if row.names is missing, the rows are numbered.

Using row.names = NULL forces row numbering. Missing or NULL row.names generate row names that are considered to be ‘automatic’ (and not preserved by as.matrix).

Adding an extra tab character to the header row made the header row have the same number of columns as the data rows, thus solving the problem.

1

votes

In short, check your column names. If your first row is the names of columns, you may be missing one or more names.

Example:

"a","b","c"
a,b,c,d
a,b,c,d

The example above will cause a row.name error because each row has 4 values, but only 3 columns are named.

This happened to me when I was building a csv from an online resources.

1

votes

I was getting the same "duplicate 'row.names' are not allowed" error for a small CSV. The problem was that somewhere outside of the 14x14 chart area I wanted there was a random cell with a space/other data.

Discovered the answer when I ran it "row.names = NULL" and there were multiple rows of blank data below my table (and therefore multiple duplicate row names all "blank").

Solution was to delete all rows/columns outside the table area, and it worked!

0

votes

in my case the problem came from the excel file. Although it seemed perfectly organized, it did not worked and I had always the message: Error in read.table(file = file, header = header, sep = sep, quote = quote, : duplicate 'row.names' are not allowed.

I tried to copy-paste my excel matrix to a new empty excel sheet and I retried to read it: it worked ! No error message anymore !

reading a csv file with repeated row names in R

7 Answers