
I'm new to R and was wondering if someone could explain why, when a row is added to an empty data.frame after its columns have been named, the column names are replaced. This does not happen if a row is added to the data.frame before the columns are named, or if an empty row is included when defining the data.frame.

Column names defined before row addition (observe the new column names, 'X.a. X.b.'):

df1 <- data.frame(character(), character(), stringsAsFactors = FALSE)
colnames(df1) <- c("one", "two")
df1 <- rbind(df1, c("a", "b"))
df1
#  X.a. X.b.
#1    a    b

Row added before column names defined:

df2 <- data.frame(character(), character(), stringsAsFactors = FALSE)
df2 <- rbind(df2, c("a", "b"))
colnames(df2) <- c("one", "two")
df2
#  one two
#1   a   b

Column names defined before row addition in a data frame defined with one empty row:

df3 <- data.frame(character(1), character(1), stringsAsFactors = FALSE)
colnames(df3) <- c("one", "two")
df3 <- rbind(df3, c("a", "b"))
df3
#  one two
#1        
#2   a   b
What is the use-case of creating a nameless (but not column-less) data.frame? Though I can't argue that it seems a little "off", I can't imagine when it would be triggered. – r2evans

Well spotted. I was using the empty data.frame with columns and using rbind before I realised that the code I was translating into R would be friendlier to me if I used an index. Now I use df[i,] <- c("a", "b") and there's nothing strange that I can see. As I stated in my question, I am new to R and thought rbind meant I could avoid assigning an index to this unusual data set. It's unusual because it is generated algorithmically. Why this would need to be tabulated is another question! – MellifluousMelt

1 Answer


Normally, data.frames can be joined with rbind() only if they have the same column names.

data1 <- data.frame(x = 1, y = 1)
data2 <- data.frame(x = 2, y = 2)
rbind(data1, data2)

Otherwise, you will get an error.

data1 <- data.frame(xa = 1, ya = 1)
data2 <- data.frame(x = 2, y = 2)
rbind(data1, data2)
# Error in match.names(clabs, names(xi)) : names do not match previous names

However, if one of the data.frames is empty (has zero rows), rbind() ignores it, and the non-empty data.frame determines the column names of the new data.frame, whichever order they are passed in.

data1 <- data.frame(x = numeric(), y = numeric())
data2 <- data.frame(xa = 2, ya = 2)
rbind(data1, data2)

data1 <- data.frame(xa = 2, ya = 2)
data2 <- data.frame(x = numeric(), y = numeric())
rbind(data1, data2)
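
To check which names survive, you can inspect names() on each result. A small sketch reusing the two data.frames above (the variable names empty, filled, res1 and res2 are only for illustration):

```r
empty  <- data.frame(x = numeric(), y = numeric())  # zero rows
filled <- data.frame(xa = 2, ya = 2)                # one row

res1 <- rbind(empty, filled)  # empty data.frame first
res2 <- rbind(filled, empty)  # empty data.frame second

# In both cases the names come from the non-empty data.frame
names(res1)
names(res2)
nrow(res1)
```

Both results should carry the names xa and ya, and no "names do not match" error is raised, because the zero-row argument is dropped before the names are compared.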

In your case, c("a", "b") is coerced to a data.frame before it is joined with the other data.frame. R generates automatic column names for the coerced data.frame, and because it is the non-empty one, it determines the column names of the new data.frame.
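
One way to keep your own names, as a minimal sketch rather than the only option, is to pass the new row as a one-row data.frame that already has the names you want (new_row is a hypothetical name used here for illustration):

```r
df1 <- data.frame(character(), character(), stringsAsFactors = FALSE)
colnames(df1) <- c("one", "two")

# Build the new row as a data.frame with matching column names,
# so the non-empty argument already has the names you want
new_row <- data.frame(one = "a", two = "b", stringsAsFactors = FALSE)

df1 <- rbind(df1, new_row)
names(df1)
```

Here the result should keep the names one and two, and later rows added the same way will match those names.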