0
votes

Given the following data frame:

df <- data.frame(c("1990, 1991", "1997, 2004", "2005"), c("1991, 1999", "1994", "1995, 2011"))

I want to add a third column on the right of the data frame that holds the lowest year in each row, taken across both columns.

An element such as "1990, 1991" holds two separate years, each of which should be considered on its own.

So in the first row, R should look at the years 1990, 1991, 1991 and 1999 and write 1990 in the third column, since it is the lowest of them.

The final table should look like this:

df <- data.frame(c("1990, 1991", "1997, 2004", "2005"), c("1991, 1999", "1994", "1995, 2011"), c("1990", "1994", "1995"))

4 Answers

2
votes

Here's an apply approach

df$result <- apply(df, 1, function(x) min(as.numeric(unlist(strsplit(paste(x, collapse=", "), ", ")))))

Collapse the 2 columns into a single string using

paste(x, collapse=", ")

Split the resulting string into a vector

unlist(strsplit(..., ", "))

Convert to numeric and find the minimum

min(as.numeric(...))
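
To make the three steps concrete, here is a minimal sketch that runs them by hand on the first row of the sample data. The variable names x, collapsed, parts and lowest are only illustrative and do not appear in the one-liner above.

```r
# First row of the sample data, as apply() would pass it to the function
x <- c("1990, 1991", "1991, 1999")

# Step 1: collapse both cells into one string
collapsed <- paste(x, collapse = ", ")      # "1990, 1991, 1991, 1999"

# Step 2: split the string into one element per year
parts <- unlist(strsplit(collapsed, ", "))  # "1990" "1991" "1991" "1999"

# Step 3: convert to numbers and take the smallest
lowest <- min(as.numeric(parts))            # 1990
```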
0
votes

A solution using splitstackshape::cSplit along with dplyr:

library(splitstackshape)
library(dplyr)

# pull() returns a plain vector, so minval becomes an ordinary column
df$minval <- df %>% cSplit(c("one", "two")) %>%
  mutate_if(is.character, as.numeric) %>%
  mutate(minval = apply(., 1, min, na.rm = TRUE)) %>%
  pull(minval)

df
#          one        two minval
# 1 1990, 1991 1991, 1999   1990
# 2 1997, 2004       1994   1994
# 3       2005 1995, 2011   1995

Data: I have added column names to the sample data. They are not needed for the solution, but they make the output easier to read.

df <- data.frame(one = c("1990, 1991", "1997, 2004", "2005"), 
               two = c("1991, 1999", "1994", "1995, 2011"))
0
votes

Here is another way, using dplyr (with stringr for the splitting):

library(dplyr)
df = data.frame(x = c("1990, 1991", "1997, 2004", "2005"), 
           y = c("1991, 1999", "1994", "1995, 2011"))
df

#>           x          y
#>1 1990, 1991 1991, 1999
#>2 1997, 2004       1994
#>3       2005 1995, 2011

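df %>%
  rowwise() %>%
  mutate(z = paste(x, y, sep = ", ") %>% 
               stringr::str_split(",\\s*") %>%  # split on commas, dropping the spaces after them
               unlist() %>% 
               as.numeric() %>%                 # compare as numbers, not as strings
               min())

#>Source: local data frame [3 x 3]
#>Groups: <by row>

#>  # A tibble: 3 x 3
#>           x           y          z    
#>        <fct>       <fct>      <dbl>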
#>1  1990, 1991  1991, 1999       1990 
#>2  1997, 2004        1994       1994 
#>3        2005  1995, 2011       1995 
0
votes

Another base option:

df$result <- sapply(strsplit(gsub(",","",do.call(paste,df))," "),min)

df
#          one        two result
# 1 1990, 1991 1991, 1999   1990
# 2 1997, 2004       1994   1994
# 3       2005 1995, 2011   1995

Here result is a character vector (min works on character strings). Wrap as.numeric around the sapply call if you need numbers.
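
A minimal sketch of the numeric variant, assuming the named sample data from the earlier answer. The names years_by_row, result_chr and result_num are hypothetical and only used here for illustration. The last lines show why the conversion can matter: strings are compared character by character, so values with different digit counts do not sort like numbers.

```r
df <- data.frame(one = c("1990, 1991", "1997, 2004", "2005"),
                 two = c("1991, 1999", "1994", "1995, 2011"))

# One character vector of years per row
years_by_row <- strsplit(gsub(",", "", do.call(paste, df)), " ")

# Character result, as in the one-liner above
result_chr <- sapply(years_by_row, min)

# Numeric result: convert each row's years before taking the minimum
result_num <- sapply(years_by_row, function(y) min(as.numeric(y)))

# String comparison is not numeric comparison
min(c("999", "1000"))              # "1000", because "1" sorts before "9"
min(as.numeric(c("999", "1000")))  # 999
```

For four-digit years both versions agree, so the conversion is mostly a safeguard and a way to get a numeric column.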