Tidy multiple observations per row in tidyverse

Question

I have data with multiple observations per row that I would like to tidy.

The data is the Relative Spectral Response (RSR) for a collection of satellite sensors. I made a toy dataset that is representative of the actual data here. Each sensor/band has two columns, one with the range of wavelengths on which it was tested ("Wvln(nm)"), and one with the response of that sensor ("RSR"). In my toy data band 1 has a strong response to 500 nm, and band 2 has a strong response to 600 nm.

library(tidyverse)

rsr_toy <- tibble::tribble(
  ~`Band 1`, ~`...2`, ~`Band 2`, ~`...4`,
  "Wvln(nm)", "RSR", "Wvln(nm)", "RSR",
  "500", "0.9", "500", "0.01",
  "600", "0.12", "600", "0.8"
)

# remove the first row containing metadata 
rsr1 <- rsr_toy %>% 
  slice(-1) %>% 
  janitor::clean_names("small_camel") %>%
rsr1
# # A tibble: 2 x 4
#   band1 x2    band2 x4   
#   <chr> <chr> <chr> <chr>
# 1 500   0.9   500   0.01 
# 2 600   0.12  600   0.8

I want to tidy the data, so each observation gets its own row, like so:

# desired outcome:
tibble::tribble(
  ~sensor, ~wavelength, ~rsr,
  "band1", 500, 0.9,
  "band1", 600, 0.12,
  "band2", 500, 0.01,
  "band2", 600, 0.8
)
# # A tibble: 4 x 3
#   sensor wavelength   rsr
#   <chr>       <dbl> <dbl>
# 1 band1         500  0.9 
# 2 band1         600  0.12
# 3 band2         500  0.01
# 4 band2         600  0.8

How to simply accomplish this using tidyr?

If the names were consistently in the pattern of band1 x1 band2 x2 you could pivot it like rsr1 %>% pivot_longer(everything(), names_to=c(".value","sensor"), names_pattern="(.+)(\\d+)") - is there any chance a little cleaning can be done on the names ahead of time? — thelatemail

Ronak Shah Ronak Shah · Accepted Answer · 2021-02-19T02:49:14

Get the data in long format, rename the columns and create sensor column.

library(tidyverse)

rsr1 %>%
  pivot_longer(cols = everything(), 
               names_to = '.value', 
               names_pattern = '(.*?)\\d') %>%
  rename(wavelength = band, rsr = x) %>%
  mutate(sensor = rep(str_subset(names(rsr1), 'band'), length.out = n()),
         .before = 1)

#  sensor wavelength rsr  
#  <chr>  <chr>      <chr>
#1 band1  500        0.9  
#2 band2  500        0.01 
#3 band1  600        0.12 
#4 band2  600        0.8

Tidy multiple observations per row in tidyverse

2 Answers