My dataset contains 200 companies over 8 years, and I have CO2 emissions as a variable. I want to see whether CO2 levels are decreasing over time. I ran something like cor(CO2, years), but the correlation is very weak because the panel structure (the fact that the observations come from different companies) is not taken into account.
I tried a panel regression with only CO2 and years, but it's not working either. Do you have any idea how to compute this kind of thing in R?
Should I calculate a correlation within each company and then combine all the values into a single correlation coefficient at the end?
So what is the goal? The correlation, or whether emissions are decreasing over time? It may also be useful to post a sample of your data; right now it is unclear where the issue is.
– user2974951
1 Answer
I didn't understand your data perfectly, but here is my best guess at an answer.
I think you have data in a long format with columns like "year", "company", and "co2", and you want to know the correlation per company.
Let's generate some example data:
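n_years <- 10
n_companies <- 200
years <- seq_len(n_years)
# Generate some CO2 data: one column per company, one row per year
co2 <- vapply(seq_len(n_companies), function(i) {
  # The original function body was lost; a random walk is assumed here
  cumsum(rnorm(n_years))
}, as.numeric(years))
colnames(co2) <- paste0("company", seq_len(n_companies))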
# Shaping into long format
starting_data <- reshape2::melt(co2)
colnames(starting_data) <- c("year", "company", "co2")
head(starting_data)
year company co2
1 1 company1 2.076313
2 2 company1 3.481235
3 3 company1 5.089682
4 4 company1 5.237323
5 5 company1 3.199387
6 6 company1 1.600289
We would like to go back to a wide format, where column names are companies and rows are years, so that the correlations are easy to calculate.
# The column is named "year" (not "years"); drop the year column after casting
wide <- reshape2::dcast(starting_data, year ~ company, value.var = "co2")[, -1]
wide[1:5, 1:5]
company1 company2 company3 company4 company5
1 2.076313 0.5128075 1.203343 -0.6344231 -3.458794
2 3.481235 2.0916749 1.764760 -1.6445168 -3.967761
3 5.089682 1.2900221 1.498875 -2.8475682 -4.185798
4 5.237323 2.2348157 1.104034 -2.9786654 -5.780707
5 3.199387 2.7052902 2.711285 -4.2117059 -6.623060
Then we could easily calculate the correlation per company by looping over the columns:
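cors <- apply(wide, 2, function(x) {
  cor(x, seq_along(x))  # assumed body (truncated in the source): correlation with the year index
})
head(cors)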
company1 company2 company3 company4 company5 company6
-0.1482829 -0.5765154 0.8813647 -0.9065915 -0.7263349 -0.7794206
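To get from the per-company values to the single coefficient asked for in the question, here is a rough sketch of two common options. It is not the only way to do this. The data below are simulated purely for illustration: the names `panel`, `company_level`, `pooled_cor`, `within_cor` and `trend_slope`, the seed, and the slope of -0.3 are all assumptions, not part of the original data. Option 1 averages the per-company correlations on the Fisher z scale; option 2 subtracts each company's own mean first, which removes the level differences between companies that make the naive cor(CO2, years) so weak.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible
n_years <- 8
n_companies <- 200

# Hypothetical long-format panel: every company has its own emission level
# plus a shared downward trend (the slope of -0.3 is an illustrative assumption)
panel <- expand.grid(year = seq_len(n_years),
                     company = paste0("company", seq_len(n_companies)))
company_level <- rnorm(n_companies, mean = 50, sd = 20)
panel$co2 <- company_level[as.integer(panel$company)] -
  0.3 * panel$year + rnorm(nrow(panel))

# Naive pooled correlation: swamped by the differences between companies
naive_cor <- cor(panel$co2, panel$year)

# Per-company correlations, computed directly on the long format
per_company <- vapply(split(panel, panel$company),
                      function(d) cor(d$co2, d$year),
                      numeric(1))

# Option 1: average on the Fisher z scale, then transform back
# (atanh() is infinite for correlations of exactly +1 or -1)
pooled_cor <- tanh(mean(atanh(per_company)))

# Option 2: within-company correlation after removing each company's mean
co2_within  <- panel$co2  - ave(panel$co2,  panel$company)
year_within <- panel$year - ave(panel$year, panel$company)
within_cor  <- cor(co2_within, year_within)

# Slope of the demeaned regression: average yearly change within companies
trend_slope <- coef(lm(co2_within ~ 0 + year_within))[["year_within"]]

print(c(naive = naive_cor, pooled = pooled_cor,
        within = within_cor, slope = trend_slope))
```

If the question is really whether emissions decrease over time, the sign and size of trend_slope are probably more informative than any of the correlations, since the slope is in CO2 units per year.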