1
votes

I would like to collapse multiple rows into one row by grouping multiple columns and not other columns. I have NA's in the columns not used for grouping. After attempting multiple solutions the resulting table is filled with NA's and no values. I am able to make the solutions work but only if I make is.na = 0. I would like not to introduce 0 into the dataframe because some measured results are zero.

This is a follow on to R collapse multiple rows into 1 row - same columns I attempted all the recommend solutions and the data results is NA

TreatName<-c('Static','Static','Dynamic', 'Static')
id<-c('patient1','patient1','patient2','patient2')
Method<-c('IV', 'IV', 'IV', 'IV')
drug1<-as.numeric(c(34,'','',''))
drug2<-as.numeric(c('',7,'',''))
drug3<-as.numeric(c('','',56, 0))
df<-data.frame(TreatName, id, Method, drug1, drug2, drug3)

library(plyr)
groupColumns = c("TreatName","id", "Method")
dataColumns = c( "drug1", "drug2","drug3")
df1<-ddply(df, groupColumns, function(x) colSums(x[dataColumns]))


The expected result should be 
TreatName   id  Method  drug1   drug2   drug3
Static  patient1    IV  34  7   NA
Dynamic patient2    IV  NA  NA  56
Static  patient2    IV  NA  NA  0

The actual results are
TreatName   id  Method  drug1   drug2   drug3
Dynamic patient2    IV  NA  NA  56
Static  patient1    IV  NA  NA  NA
Static  patient2    IV  NA  NA  0

I noticed if I change the na to zero
df[is.na(df)]<-0
then use the ddply function it works.  But now I introduced zero when no measurement was taken.

Open to any solutions
1

1 Answers

1
votes

Here is one option with dplyr

library(dplyr)
df %>% 
   group_by_at(groupColumns) %>%
   summarise_at(vars(dataColumns),  ~ if(all(is.na(.))) NA_real_ 
         else na.omit(.))
# A tibble: 3 x 6
# Groups:   TreatName, id [3]
#  TreatName id       Method drug1 drug2 drug3
#  <fct>     <fct>    <fct>  <dbl> <dbl> <dbl>
#1 Dynamic   patient2 IV        NA    NA    56
#2 Static    patient1 IV        34     7    NA
#3 Static    patient2 IV        NA    NA     0