
I have two matrices with the same row names, and I found the three smallest values in each row (thanks to @Maurits Evers). Now I need the three smallest values per row without repeated column names, i.e. each Id may be used only once across all rows. For example, if Inst1 picked Alt5 as a minimum, Inst2 should not pick it too; it should skip any Id that has already been chosen and take its next-smallest value instead. However, if a later row has a lower value for that Id than the earlier row, the later row keeps the Id and the earlier row is given its own next-smallest value. Any suggestions?

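library(dplyr)   # %>%, group_by(), top_n(), arrange()
library(tidyr)   # gather(), spread()
library(tibble)  # rownames_to_column()

set.seed(2017)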
X <- matrix(runif(20), nrow=4)
rownames(X) <- paste0("Inst", seq(nrow(X)))
colnames(X) <- paste0("Ref", seq(ncol(X)))

Y <- matrix(runif(20), nrow=4)
rownames(Y) <- paste0("Inst", seq(nrow(Y)))
colnames(Y) <- paste0("Alt", seq(ncol(Y)))

cbind.data.frame(X, Y) %>%
    rownames_to_column("row") %>%
    gather(Id, Minimum, -row) %>%
    group_by(row) %>%
    top_n(-3, Minimum) %>%
    arrange(row, Minimum)

## A tibble: 12 x 3
## Groups:   row [4]
#   row   Id    Minimum
#   <chr> <chr>   <dbl>
# 1 Inst1 Ref4  0.0251
# 2 Inst1 Alt3  0.0763
# 3 Inst1 Alt5  0.129
# 4 Inst2 Alt5  0.110
# 5 Inst2 Alt4  0.212
# 6 Inst2 Alt3  0.261
# 7 Inst3 Ref2  0.0393
# 8 Inst3 Alt5  0.177
# 9 Inst3 Ref1  0.469
#10 Inst4 Ref3  0.00202
#11 Inst4 Alt3  0.0175
#12 Inst4 Ref1  0.289

# Expected output:
 Inst   Min1  1        Min2  2       Min3  3
 Inst1  Ref4  0.0251   Alt2  0.228   Ref5  0.395375
 Inst2  Alt5  0.110    Alt4  0.212   Alt1  0.380
 Inst4  Ref3  0.00202  Alt3  0.0175  Ref1  0.289

1 Answer


Thank you for the question and comments. A fairly straightforward solution is a loop that, for each row, selects the smallest values whose Ids were not selected by a previous row, while recording the values scanned so far so that the minimum for each Id can be determined. The spread() function can then be used for formatting.

# Long format, sorted so that each row's values are in ascending order
mindata <- as.data.frame(
  cbind.data.frame(X, Y) %>%
    rownames_to_column("row") %>%
    gather(Id, Minimum, -row) %>%
    arrange(row, Minimum))

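Pr_min  <- c()   # picks made so far (row, Id, Minimum)
Def_min <- c()   # Id/Minimum pairs scanned so far
par <- sum(mindata$row == "Inst1")   # number of entries per row
i <- 1

for (row in unique(mindata$row)) {
  # Three smallest values in this row whose Ids have not been picked yet
  initial3 <- head(mindata[mindata$row == row & !(mindata$Id %in% Pr_min$Id), 1:3], 3)
  if (nrow(initial3) == 0) { next }
  # Everything in this row's block up to the last pick; rownames are
  # character, so convert them before taking the maximum
  New_min <- mindata[(1 + (i - 1) * par):max(as.integer(rownames(initial3))), 2:3]

  Pr_min  <- rbind(Pr_min, initial3)
  Def_min <- rbind(Def_min, New_min)
  i <- i + 1
}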

# Attach to each picked Id its minimum among the recorded values
Final <- Pr_min %>%
  left_join(Def_min %>%
              group_by(Id) %>%
              summarise(min = min(Minimum)),
            by = "Id") %>%
  select(-Minimum)

# Label the picks Min1..Min3, then spread values and Ids to wide format
CN <- rep(c("Min1", "Min2", "Min3"), ceiling(length(Final$Id) / 3))[1:nrow(Final)]
Val_min  <- Final %>% cbind(CN) %>% select(-Id) %>% spread(CN, min)
name_min <- Final %>% cbind(CN) %>% select(-min) %>% spread(CN, Id)
cbind(name_min, Val_min[, 2:4])
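
For comparison, here is a minimal base R sketch of one possible reading of the "lower value wins" rule from the question: sort every (row, Id) cell by value and hand each Id to the first row that still needs a pick. The helper name pick_unique_min and the toy matrix m are made up for illustration; this is a simple greedy pass and may not match the expected output in every case.

```r
# Hypothetical helper (not from the question): greedy selection of up to
# k columns per row such that no column Id is used by more than one row.
# Assumes m is a numeric matrix with both row names and column names.
pick_unique_min <- function(m, k = 3) {
  # One entry per matrix cell, in column-major order
  long <- data.frame(
    row = rownames(m)[row(m)],
    Id = colnames(m)[col(m)],
    Minimum = as.vector(m),
    stringsAsFactors = FALSE
  )
  long <- long[order(long$Minimum), ]   # smallest values first

  used <- character(0)                              # Ids already assigned
  count <- setNames(integer(nrow(m)), rownames(m))  # picks made per row
  keep <- logical(nrow(long))

  for (j in seq_len(nrow(long))) {
    r <- long$row[j]
    id <- long$Id[j]
    # Accept the cell only if the row still needs picks and the Id is free
    if (count[[r]] < k && !(id %in% used)) {
      keep[j] <- TRUE
      used <- c(used, id)
      count[[r]] <- count[[r]] + 1L
    }
  }

  out <- long[keep, ]
  out <- out[order(out$row, out$Minimum), ]
  rownames(out) <- NULL
  out
}

# Toy data: both rows have their smallest value in column A,
# and Inst2 has the lower one, so Inst2 keeps A.
m <- matrix(c(0.10, 0.2, 0.3, 0.4,
              0.05, 0.6, 0.7, 0.8),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("Inst1", "Inst2"), c("A", "B", "C", "D")))
res <- pick_unique_min(m, k = 2)
print(res)
```

With the matrices from the question this would be called as pick_unique_min(cbind(X, Y), k = 3). Because Ids are handed out strictly in order of value, a row can end up with fewer than k picks if the Ids run out.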