I'm new to programming in R and I have the following dataframe:

  A B C D E
1 3 0 4 5 0
2 0 0 5 1 0
3 2 1 2 0 3

I would like to get a new dataframe containing the column indices of the n largest values in each row. For example, with n = 3, I want the new dataframe to look like this:

  F G H
1 1 3 4
2 1 3 4
3 1 3 5

So the first row of this dataframe contains the column indices of the 3 largest values in row 1 of the original dataframe, and so on.

My original idea was to write a loop with which.max, but that seems far too long and inefficient. Does anyone have a better idea?

1 Answer

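We can use apply over the rows. Here order(-x) lists the column positions from the largest value to the smallest, head(..., 3) keeps the top three, and sort puts those indices in ascending order:

t(apply(df1, 1, function(x) sort(head(order(-x), 3))))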
#   [,1] [,2] [,3]
#1    1    3    4
#2    1    3    4
#3    1    3    5
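
For a general n, the same idea can be wrapped in a small function that returns a data frame, which is what the question asks for. This is only a sketch: the name top_n_idx and its arguments are made up for illustration, and the example data is rebuilt inside the block so it runs on its own.

```r
# Hypothetical helper (not part of the original answer): for each row, return
# the column indices of the n largest values, in ascending index order.
top_n_idx <- function(df, n = 3) {
  stopifnot(n >= 1, n <= ncol(df))
  m <- as.matrix(df)
  # order(-x) lists positions from largest to smallest value;
  # tied values keep their left-to-right order
  idx <- apply(m, 1, function(x) sort(order(-x)[seq_len(n)]))
  # apply() returns an n x nrow(m) matrix (or a plain vector when n == 1),
  # so rebuild the shape explicitly with one output row per input row
  idx <- matrix(idx, nrow = nrow(m), ncol = n, byrow = TRUE)
  out <- as.data.frame(idx)
  rownames(out) <- rownames(df)
  out
}

# Same values as the example data in the question
df1 <- data.frame(A = c(3, 0, 2), B = c(0, 0, 1), C = c(4, 5, 2),
                  D = c(5, 1, 0), E = c(0, 0, 3))
top_n_idx(df1, 3)
#   V1 V2 V3
# 1  1  3  4
# 2  1  3  4
# 3  1  3  5
```

The result columns get the default names V1, V2, ...; they can be renamed afterwards with names(), for example to F, G, H.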

Or using the tidyverse:

library(dplyr)
library(tidyr)
df1 %>%
   mutate(rn = row_number()) %>%         # row id
   pivot_longer(cols = -rn) %>%          # one row per (rn, column) pair
   group_by(rn) %>%
   mutate(ind = row_number()) %>%        # column index within each original row
   arrange(rn, desc(value)) %>%
   slice(1:3) %>%                        # positional index; slice() has no `n` argument
   select(-name, -value) %>%
   arrange(rn, ind) %>%
   mutate(nm1 = c("F", "G", "H")) %>%    # names for the output columns
   ungroup() %>%
   pivot_wider(names_from = nm1, values_from = ind)

data

df1 <- structure(list(A = c(3L, 0L, 2L), B = c(0L, 0L, 1L),
                      C = c(4L, 5L, 2L), D = c(5L, 1L, 0L),
                      E = c(0L, 0L, 3L)),
                 class = "data.frame", row.names = c("1", "2", "3"))