0
votes

I have a data frame whose column names are sequential numbers (1, 2, 3, ...). They are not converted to X1, X2, X3, ... when the file is read into the workspace, which leads to an undesired column ordering in ggplot2 (1, 10, 11, ..., 2, 21, 22).

I tried to change the colnames, but whatever I do is ignored:

data <- read.table(file = "tabbed_text.txt", sep="\t", header=T, row.names=1)
str(data[,1:10])
'data.frame':   1208 obs. of  10 variables:
 $ 1 : int  1147 748 1147 944 841 938 513 645 577 309 ...
 $ 2 : int  2298 1017 1741 1380 1230 1460 696 1050 1006 442 ...
...
colnames(data[,1:10])
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10"
paste0("V", c(1:10))
 [1] "V1"  "V2"  "V3"  "V4"  "V5"  "V6"  "V7"  "V8"  "V9"  "V10"
colnames(data[,1:10]) <- paste0("V", c(1:10))
colnames(data[,1:10])
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10"
new.names <- c("I","do","not","understand","why", "this", "is", "happening", "to", "me")
colnames(data[,1:10]) <- new.names
colnames(data[,1:10])
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10"
str(names(data[,1:10]))
chr [1:10] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10"

Where am I going wrong?

Here is the dput output of the first 10x10 cells:

> dput(data[1:10,1:10])
structure(list(X1 = c(1147L, 748L, 1147L, 944L, 841L, 938L, 513L, 
645L, 577L, 309L), X2 = c(2298L, 1017L, 1741L, 1380L, 1230L, 
1460L, 696L, 1050L, 1006L, 442L), X3 = c(1239L, 634L, 1037L, 
979L, 766L, 624L, 557L, 503L, 425L, 337L), X4 = c(1180L, 393L, 
883L, 699L, 641L, 456L, 478L, 378L, 321L, 227L), X5 = c(1178L, 
650L, 892L, 889L, 767L, 660L, 384L, 547L, 457L, 318L), X6 = c(3135L, 
1137L, 1493L, 1371L, 1024L, 1103L, 846L, 753L, 728L, 425L), X7 = c(1989L, 
807L, 1368L, 1071L, 1154L, 1055L, 662L, 658L, 680L, 435L), X8 = c(4469L, 
1917L, 2524L, 2294L, 1834L, 2082L, 1181L, 1240L, 1392L, 825L), 
    X9 = c(394L, 553L, 666L, 900L, 707L, 673L, 503L, 511L, 478L, 
    323L), X10 = c(619L, 1550L, 2069L, 1710L, 2023L, 1473L, 1137L, 
    1041L, 1069L, 886L)), .Names = c("X1", "X2", "X3", "X4", 
"X5", "X6", "X7", "X8", "X9", "X10"), row.names = c(11541L, 11861L, 
985L, 4702L, 301L, 234L, 5876L, 2530L, 247L, 5843L), class = "data.frame")

Interestingly, the colnames seem to be stored internally in the "X-integer" format. I am going to change the headers in the original file, but I would like to understand what I am doing wrong here.


1 Answer

3
votes

You have a standard data.frame with integer columns. Column names are always of type character, even when they look like numbers. colnames(data[,1:10]) <- ... does not raise an error, but it only renames a temporary copy of the subset; when that copy is written back into data, the existing column names are kept, so the new names are lost immediately. To change part of the column names, assign to a subset of the names vector instead: colnames(data)[1:10] <- .... Note also that the wide format of your data.frame is not convenient for ggplot2, which prefers long-format data.
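
As a minimal sketch of the difference, the snippet below uses a small made-up data.frame (three columns, two rows, values borrowed from the question) rather than the real file. The row names `r1`/`r2` and the names `unchanged`, `long`, `sample`, `column` and `value` are only illustrative assumptions. It also shows one possible base-R way to get a long-format table whose column factor keeps the original order, which is one way to avoid the 1, 10, 11, ..., 2 ordering in a plot.

```r
# Toy stand-in for the asker's data. check.names = FALSE keeps the
# numeric-looking column names instead of turning them into X1, X2, ...
data <- data.frame(`1`  = c(1147L, 748L),
                   `2`  = c(2298L, 1017L),
                   `10` = c(619L, 1550L),
                   check.names = FALSE,
                   row.names = c("r1", "r2"))

# Renaming a subset of the data.frame: no error, but `data` keeps its names.
colnames(data[, 1:3]) <- paste0("V", c(1, 2, 10))
unchanged <- colnames(data)
print(unchanged)   # "1" "2" "10"

# Assigning to a subset of the names vector changes the names.
colnames(data)[1:3] <- paste0("V", c(1, 2, 10))
print(colnames(data))   # "V1" "V2" "V10"

# Long format, built by hand in base R. unlist() concatenates the columns
# one after another, so the row names repeat once per column and each
# column name repeats once per row.
long <- data.frame(
  sample = rep(rownames(data), times = ncol(data)),
  column = factor(rep(colnames(data), each = nrow(data)),
                  levels = colnames(data)),  # keep the original column order
  value  = unlist(data, use.names = FALSE)
)
print(long)
```

Because `column` is a factor with explicitly ordered levels, a plot that maps it to an axis should follow V1, V2, V10 rather than alphabetical order.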