0
votes

I have a data frame DF in which I want to insert new column called Stage by comparing with the data frame DF1 columns Col1,Col2,Col3,Col4,Col5,Col6. Below is my sample data format

Col1=c("ABCD","","","","wxyz","")
Col2=c("","","MTNL","","","")
Col3=c("","PQRS","","","","")
Col4=c("","","","","","")
Col5=c("","","","","","")
Col6=c("","","","","","EFGH")

DF=data.frame(Col1,Col2,Col3,Col4,Col5,Col6)

Style=c("ABCD","WXYZ","PQRS","EFGH")
DF1=data.frame(Style)

Stage=c(1,1,3,6)
DFR=data.frame(Style,Stage)

DFR would be my resulting data frame. Can Some one help me to solve this.

3

3 Answers

1
votes

A tidyverse method:

library(tidyverse)

DFR <- DF %>% 
  mutate(across(everything(), ~na_if(., ""))) %>% 
  pivot_longer(cols = everything(),
               names_to = "Stage",
               values_to = "Style",
               values_drop_na = T) %>% 
  filter(Style %in% c("ABCD","WXYZ","PQRS","EFGH"))%>%
  mutate(Stage = as.integer(gsub("Col", "", Stage)))

The first mutate call replaces your blank values with NA. Then I pivot your table to long format and drop NA values, before filtering for only the Style values you're interested in (these can be saved in a vector instead to make the code cleaner, but here the column and your vector are named the same so I didn't want to make it confusing). The second mutate call is optional, it removes "Col" from each of your Stage values and converts the column to the type integer.

1
votes

You can join the data after getting it into long format.

library(dplyr)
library(tidyr)

DF %>%
  pivot_longer(cols = everything()) %>%
  right_join(DF1, by = c('value' = 'Style'))

#  name  value
#  <chr> <chr>
#1 Col1  ABCD 
#2 Col3  PQRS 
#3 Col6  EFGH 
#4 NA    WXYZ 
0
votes

I tried to solve this by below way and it is working

DF <- DF %>% 
  mutate(across(everything(), ~na_if(., "")))

DFR=DF1
DFR$Stage=ifelse(is.na(DF1$Style),NA,ifelse(DF1$Style %in% DF$Col1,1,
                                            ifelse(DF1$Style %in% DF$Col2,2,
                                                   ifelse(DF1$Style %in% DF$Col3,3,
                                                          ifelse(DF1$Style %in% DF$Col4,4,
                                                                 ifelse(DF1$Style %in% DF$Col5,5,
                                                                        ifelse(DF1$Style %in% DF$Col6,6,NA)))))))