4
votes

I am trying to return a data frame containing every value in each column that is less than -1.5, together with its column header and row name. I have almost everything worked out except the final step, where I replace a column of column numbers with the corresponding column names from the original data frame. When several values from the same column are less than -1.5, the new column-name values are listed as "column name.1". I searched around and found that make.unique appears to do a similar thing, but I never called that function.

A <- c(0.6, -0.5, 0.1, 1.6, -1.6, 0.4, -1.6)
B <- c(0.7, -2.1, -0.3, 1.1, 2.1, -1.7, 1.1)
DF <- as.data.frame(cbind(A, B))
colnames(DF) <- c("010302A620300302000", "010803A110100069000")
rownames(DF) <- c("1996", "1997", "1998", "1999", "2000", "2001", "2002")

So my original dataframe looks something like this:

     010302A620300302000 010803A110100069000
1996                 0.6                 0.7
1997                -0.5                -2.1
1998                 0.1                -0.3
1999                 1.6                 1.1
2000                -1.6                 2.1
2001                 0.4                -1.7
2002                -1.6                 1.1

In order to get the relevant values for each row:

library(data.table)  # needed for setDT()
DF.new <- as.data.frame(which(DF <= -1.5, arr.ind = TRUE, useNames = TRUE))
DF.new <- as.data.frame(setDT(DF.new, keep.rownames = TRUE)[])

DF.new$SUID <- colnames(DF[, DF.new[ ,3]])

This brings me to the problem: how do I use the colnames function so that the resulting SUID column does not get ".1" appended to repeated names, as I see here:

    rn row col                  SUID
1 2000   5   1   010302A620300302000
2 2002   7   1   010302A620300302000.1
3 1997   2   2   010803A110100069000
4 2001   6   2   010803A110100069000.1

Thanks in advance!

2
Generally, it's inadvisable to use as.data.frame. Usually better to just use data.frame and (OMG!) STOP using cbind inside either of those functions. BAD DOG. Or perhaps? BAD Teacher? - IRTFM
Talk about turgid code. Putting setDT(...) inside as.data.frame is rather convoluted. - IRTFM

2 Answers

4
votes

Subset your column names from a character vector rather than columns from a new data frame.

Like this:

DF.new$SUID <- colnames(DF)[DF.new[ ,3]]

Instead of this:

DF.new$SUID <- colnames(DF[, DF.new[ ,3]])
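
To make the difference concrete, here is a small sketch using the data from the question. As far as I can tell, the ".1" comes from data frame subsetting itself: selecting the same column more than once builds a new data frame, and `[.data.frame` makes the duplicated column names unique (the same result make.unique would give). The names idx, dup_names and plain_names are just placeholders for this illustration.

```r
# Rebuild the example data from the question.
DF <- data.frame(
  A = c(0.6, -0.5, 0.1, 1.6, -1.6, 0.4, -1.6),
  B = c(0.7, -2.1, -0.3, 1.1, 2.1, -1.7, 1.1)
)
colnames(DF) <- c("010302A620300302000", "010803A110100069000")

# Column indices of the matches, as in the "col" column of DF.new.
idx <- c(1, 1, 2, 2)

# Subsetting the data frame with repeated indices creates a new data frame
# whose duplicated column names are made unique, hence the ".1" suffix.
dup_names <- colnames(DF[, idx])

# Subsetting the character vector of names simply repeats the elements.
plain_names <- colnames(DF)[idx]

print(dup_names)
print(plain_names)
```

So the first form asks R for the names of a four-column data frame, while the second only picks elements out of a character vector, and a character vector is allowed to contain repeats.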
0
votes

A quick and dirty fix is DF.new$SUID <- sub("\\.[0-9]+$", "", DF.new$SUID) to strip the ".1" suffix. (floor() will not work here, because SUID is a character vector, not a number.)