0
votes

From two columns soldier and superior I am trying to create a list column of the superior ranks above each soldier that I can later unnest to form long data will all ranks above each soldier's rank. So for "Srg" the values would be "Lt, Maj, Col, Gen", and for "Maj" the value would be "Col, Gen".

Currently the only way I've found to apply this for-loop to the entire dataframe using purrr::pmap() requires me to hardcode in variable and dataframe names into the for-loop.

Is there a way I could extract these ranks in a more generalizable function that took data and variable names as arguments?

library(dplyr)
library(tidyr)
library(purrr)

# Create test data
data <-
  dplyr::tibble(
    soldier = c("Srg", "Lt", "Maj", "Col", "Gen"),
    superior  = c("Lt", "Maj", "Col", "Gen", NA)
  )

# Define custom function
get_ranks_above <- function(id, max_steps = 5){

  ranks_above <- vector("list", length = max_steps)

  for (i in 1:max_steps) {
    ranks_above[[i]] <- 
      data.frame(
        superior_list = data$superior[data$soldier == id]
      )

    id <- ranks_above[[i]]$superior_list
  }

  do.call(rbind, ranks_above)
}

# Apply custom function
data_ranked <- 
  data %>%
  mutate(
    ranks_above = pmap(
      list(id = soldier), 
      get_ranks_above
    )
  )

# Unnest list column and add numeric ranks
data_ranked %>% 
  unnest(ranks_above) %>% 
  drop_na() %>% 
  group_by(soldier) %>% 
  mutate(rank_from_top = seq(n(),1)) %>% 
  ungroup()

When I try to write the custom function get_ranks_above() with arguments for data and variable names I get an error message: Error in mutate_impl(.data, dots) : Evaluation error: Element 1 has length 2, not 1 or 5..

get_ranks_above <- function(data, id = soldier, lower_rank = data$soldier, upper_rank = data$superior, max_steps = 5){

  ranks_above <- vector("list", length = max_steps)

  for (i in 1:max_steps) {
    ranks_above[[i]] <- 
      data.frame(
        superior_list = upper_rank[lower_rank == id]
      )

    id <- ranks_above[[i]]$superior_list
  }

  do.call(rbind, ranks_above)
}

data_ranked <- 
  data %>%
  mutate(
    ranks_above = pmap(
      list(
        data = data, 
        id = soldier, 
        lower_rank = data$soldier, 
        upper_rank = data$superior, 
        max_steps = 5
      ), 
      get_ranks_above
    )
  )
2

2 Answers

0
votes

I think the simplest way to do what you are trying to do is to use an ordered factor to compare ranks with each other. Here I can use parse_factor to create the ordered factor, using ranks as the levels and specifying that the levels should be ordered as given (note that ranks is already in order). That makes the superior easy to determine, we just go through the ranks and check which ones are > than our current soldier, and subset ranks accordingly. Then we can unnest as desired and have a our long form data.

library(tidyverse)
ranks <- c("Srg", "Lt", "Maj", "Col", "Gen")
set.seed(12345)
some_soldiers <- tibble(
  soldier = sample(ranks, 5)
)
some_soldiers
#> # A tibble: 5 x 1
#>   soldier
#>   <chr>  
#> 1 Col    
#> 2 Gen    
#> 3 Maj    
#> 4 Lt     
#> 5 Srg

some_soldiers %>%
  mutate(
    soldier = parse_factor(soldier, levels = ranks, ordered = TRUE),
    superior = map(soldier, ~ ranks[which(ranks > .x)])
  ) %>%
  unnest()
#> # A tibble: 10 x 2
#>    soldier superior
#>    <ord>   <chr>   
#>  1 Col     Gen     
#>  2 Maj     Col     
#>  3 Maj     Gen     
#>  4 Lt      Maj     
#>  5 Lt      Col     
#>  6 Lt      Gen     
#>  7 Srg     Lt      
#>  8 Srg     Maj     
#>  9 Srg     Col     
#> 10 Srg     Gen

Created on 2018-08-21 by the reprex package (v0.2.0).

0
votes

I would tackle this problem with ordered factors. Once you have a table with all the information, you can easily merge it to any data frame. General idea:

library(dplyr)
library(purrr)
sld_levels <- c("Srg", "Lt", "Maj", "Col", "Gen")
tibble(sld_rank = factor(sld_levels, 
                         levels = sld_levels, 
                         ordered = TRUE)) %>% 
  mutate(rank_above = map(.x = sld_rank, ~sld_rank[.x < sld_rank]))