2
votes

I'm trying to convert my data frame format using R. I want unique company name as it has multiple observations for each company. My data looks like

company name    Values  Year
    A              1    2010
    A              2    2011
    B              4    2010
    B              6    2012
    C              8    2011

I want below format

 company name   first_value First_year  second_values second_year
     A              1          2010          2           2011
     B              4          2010          6           2012  
     C              8          2011          NA           NA

I have tried this code but it is not giving result what I am expecting

library(plyr)
extract.hashtags <- function(x) {
x <- subset(x,select=c(-Company.Name))
mat <- as.matrix(x)
dim(mat) <- c(1,length(mat))
as.data.frame(mat)
}

df1 = ddply(data, .(Company.Name), extract.hashtags )
2

2 Answers

6
votes

You can use reshape in base R after you add a "time" variable, which can be done with getanID from my "splitstackshape" package:

reshape(getanID(mydf, "companyname"), idvar = "companyname", 
          timevar = ".id", direction = "wide")
#    companyname Values.1 Year.1 Values.2 Year.2
# 1:           A        1   2010        2   2011
# 2:           B        4   2010        6   2012
# 3:           C        8   2011       NA     NA
6
votes

A similar solution using the devel version of data.table (v 1.9.5+)

library(data.table) ## v 1.9.5+
dcast(setDT(df)[, indx := 1:.N, by = company_name], 
      company_name ~ indx, value.var = c("Values", "Year"))

# c   ompany_name Values_1 Values_2 Year_1 Year_2
# 1:            A        1        2   2010   2011
# 2:            B        4        6   2010   2012
# 3:            C        8       NA   2011     NA

The idea is to add a counter per group and then reshape from long to wide according to that counter while specifying two variables as the explained vars simultaneously (currently available only in the devel version).