I am new to data analysis with R. I recently received a pre-formatted environmental observation-model dataset; an example subset is shown below:
date                 site    obs  mod    site         obs  mod
2000-09-01 00:00:00  campus  NA   61.63  city centre  66   56.69
2000-09-01 01:00:00  campus  52   62.55  city centre  NA   54.75
2000-09-01 02:00:00  campus  52   63.52  city centre  56   54.65
Basically, the data contain time series of hourly observed and modelled concentrations of a pollutant at various sites, stored in recurring column groups, i.e., site - obs - mod (the example shows only 2 of the 75 sites). I read this "wide" dataset in as a data frame and want to reshape it into a "long" (narrower) format like this:
date                 site         obs  mod
2000-09-01 00:00:00  campus       NA   61.63
2000-09-01 01:00:00  campus       52   62.55
2000-09-01 02:00:00  campus       52   63.52
2000-09-01 00:00:00  city centre  66   56.69
2000-09-01 01:00:00  city centre  NA   54.75
2000-09-01 02:00:00  city centre  56   54.65
I believe the "reshape2" package is the right tool for this. First I tried to melt the dataset, intending to dcast it afterwards:
test.melt <- melt(test.data, id.vars = "date", measure.vars = c("site", "obs", "mod"))
However, this returned only part of the data: the records for every site after the first one ("campus"), such as "city centre", were dropped:
date                 variable  value
2001-01-01 00:00:00  site      campus
2001-01-01 01:00:00  site      campus
2001-01-01 02:00:00  site      campus
2001-01-01 00:00:00  obs       NA
2001-01-01 01:00:00  obs       52
2001-01-01 02:00:00  obs       52
2001-01-01 00:00:00  mod       61.63
2001-01-01 01:00:00  mod       62.55
2001-01-01 02:00:00  mod       63.52
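A possible explanation for the truncation, offered as an assumption rather than a confirmed diagnosis: the wide data frame has several columns sharing the names site, obs and mod, and selecting a column by name only reaches the first match. The base R sketch below illustrates this on a hand-built copy of the example subset. The object test.data here is reconstructed for illustration and is not the original file.

```r
# Hand-built copy of the example subset (an assumption, not the real data).
# check.names = FALSE keeps the repeated column names as they are.
test.data <- data.frame(
  date = as.POSIXct(c("2000-09-01 00:00:00", "2000-09-01 01:00:00",
                      "2000-09-01 02:00:00"), tz = "UTC"),
  site = "campus", obs = c(NA, 52, 52), mod = c(61.63, 62.55, 63.52),
  site = "city centre", obs = c(66, NA, 56), mod = c(56.69, 54.75, 54.65),
  check.names = FALSE, stringsAsFactors = FALSE
)

# The names repeat, so they cannot identify a column unambiguously.
has.duplicates <- anyDuplicated(names(test.data)) > 0

# Selecting by name returns the first matching column only (campus).
first.site <- test.data[["site"]]

# make.unique() appends .1, .2, ... to repeated names.
unique.names <- make.unique(names(test.data))
```

After renaming with make.unique(), every column can be addressed by name, although the site suffixes would still need to be stripped again after any reshaping.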
I then tried recast:
test.recast <- recast(test.data, date ~ site + obs + mod)
However, it returned an error message:
Error in eval(expr, envir, enclos) : object 'site' not found
I have tried to search for previous questions but have not found similar scenarios (correct me if I am wrong). Could someone please help me with this?
Many thanks in advance!
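For reference, the target layout described above can be sketched in base R without reshape2, by cutting the wide table into blocks of three columns by position and stacking them. This is only a sketch under stated assumptions: the first column is date, every following group is ordered exactly site, obs, mod, and the object names wide, blocks and long are made up for illustration.

```r
# Hand-built copy of the example subset (an assumption, not the real data).
wide <- data.frame(
  date = as.POSIXct(c("2000-09-01 00:00:00", "2000-09-01 01:00:00",
                      "2000-09-01 02:00:00"), tz = "UTC"),
  site = "campus", obs = c(NA, 52, 52), mod = c(61.63, 62.55, 63.52),
  site = "city centre", obs = c(66, NA, 56), mod = c(56.69, 54.75, 54.65),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Columns after date are assumed to come in blocks of three: site, obs, mod.
block.starts <- seq(2, ncol(wide), by = 3)

# Take date plus one block at a time, by position, so the repeated
# column names do not matter, then give each block the same names.
blocks <- lapply(block.starts, function(i) {
  block <- wide[c(1, i, i + 1, i + 2)]
  names(block) <- c("date", "site", "obs", "mod")
  block
})

# Stack the blocks: all campus rows first, then all city centre rows.
long <- do.call(rbind, blocks)
rownames(long) <- NULL
```

Indexing by position sidesteps the repeated names entirely. With 75 sites the same code would loop over 75 blocks, provided the column order assumption holds for the full dataset.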