I got an error when I tried to set the first column as the row names:
library(data.table)
dt <- fread('../data/data_logTMP.csv', header = TRUE)
rownames(dt) <- dt$GENE
I used duplicated() to check the values:
> which(duplicated(dt$GENE) == TRUE)
[1] 20209 21919
Therefore, I compared these values:
> dt$GENE[20209] == dt$GENE[21919]
[1] FALSE
> dt$GENE[20209]
[1] "1-Mar"
> dt$GENE[21919]
[1] "2-Mar"
Why were these two values recognized as duplicated? And how can I fix this problem?
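One thing worth checking, sketched here on a made-up vector rather than the real data: duplicated() returns TRUE only for the second and later occurrences of a value, so the two flagged rows need not be duplicates of each other. Each one may repeat a value that first appears earlier in the column. The vector gene below is a hypothetical stand-in for dt$GENE, and make.unique() is just one possible way to get unique names.

```r
# Toy vector standing in for dt$GENE (values are made up for illustration)
gene <- c("1-Mar", "TP53", "2-Mar", "1-Mar", "2-Mar")

# duplicated() is FALSE for the first occurrence and TRUE for later repeats
later_copies <- which(duplicated(gene))
later_copies
# [1] 4 5

# Flag every copy, first occurrences included, by also scanning from the end
all_copies <- which(duplicated(gene) | duplicated(gene, fromLast = TRUE))
all_copies
# [1] 1 3 4 5

# make.unique() appends .1, .2, ... to repeats so the values can serve as row names
unique_gene <- make.unique(gene)
unique_gene
# [1] "1-Mar"   "TP53"    "2-Mar"   "1-Mar.1" "2-Mar.1"
```

On the real table, running the fromLast variant on dt$GENE should show which earlier rows carry the same names as rows 20209 and 21919.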
is.data.frame will be TRUE because data.table is an extension of data.frame (you can look at class(dt)). Row names are one place where the extension is a bit muddied; data.table prefers to use keys instead of row names; see this vignette – MichaelChirico