I have a basic understanding of R that mostly entails the ability to run regressions and summary statistics, so if there appear any gaps in my knowledge I would appreciate being pointed in the correct direction.
I have time series data in CSV that is formatted as follows:
Facility ID, Utility Type, Account No, Unit Name, Date 1, Date 2, Date 3, Date 4
There will be multiple rows for a specific account number referencing a unique utility type and facility (i.e., one row entry for Unit Name = L, one row entry for Unit Name = USD). The account number values for a particular unit at every date are entered in each "date" column. I would like to be able to write a script that enables me to re-export the data where each Date column doesn't contain entries for multiple units. I would also like to then designate to R that the Date columns represent monthly time series data points, and from there do various time series analysis.
I appreciate your help in telling me how to clean up this data.
As requested, sample data:
Facility ID, Facility Name, State, Utility Type, Supplier, Account No., Unit Name, 7/1/14, 8/1/14
4015, Palm Court Apts, CA, Chilled Water, PG&E, 87993, USD, 42333, 41775
4015, Palm Court Apts, CA, Chilled Water, PG&E, 87993, ton-hr, 244278, 238035
4044, 18 Sawtelle, CA, Natural Gas, Chevron, 17965, USD, 4860, 5890
4044, 18 Sawtelle, CA, Natural Gas, Chevron, 17965, M^3, 7639, 8895
Example output:
Facility ID, Facility Name, State, Utility Type, Supplier, Account No., Quantity Consumed, Unit of Measure, Utility Bill, Currency, Date
4015, Palm Court Apts, CA, Chilled Water, PG&E, 87993, 244278, ton-hr, 42333, USD, 7/1/14
4015, Palm Court Apts, CA, Chilled Water, PG&E, 87993, 238035, ton-hr, 41775, USD, 8/1/14
4044, 18 Sawtelle, CA, Natural Gas, Chevron, 17965, 7639, M^3, 4860, USD, 7/1/14
4044, 18 Sawtelle, CA, Natural Gas, Chevron, 17965, 8895, M^3, 5890, USD, 8/1/14

help(unstack)or , look into the reshape2 package for some ideas of how to proceed. - vpipkt