I am trying to convert long format wind data into wide format. Both wind speed and wind direction are listed within the Parameter.Name column. These values need to be cast by both the Local.Site.Name and Date.Local variables.
If there are multiple observations per unique Local.Site.Name + Date.Local row, then I want the mean value of those observations. The built-in argument 'fun.aggregate = mean' works just fine for wind speed, but mean wind direction cannot be computed this way because the values are in degrees. For example, the arithmetic mean of two wind directions near north (350 and 10) comes out as south: (350 + 10)/2 = 180, even though the circular mean is 0 (equivalently 360).
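To make the difference concrete, here is a minimal sketch in base R. The helper name circ_mean_deg is hypothetical (it is not from the 'circular' package); it averages the unit vectors of the angles, which is one common way to define a circular mean.

```r
# Hypothetical helper: circular mean of directions given in degrees.
# Each angle is turned into a unit vector, the vectors are averaged,
# and the angle of the averaged vector is mapped back into [0, 360).
circ_mean_deg <- function(deg) {
  rad <- deg * pi / 180
  ang <- atan2(mean(sin(rad)), mean(cos(rad))) * 180 / pi
  (ang + 360) %% 360
}

arithmetic <- mean(c(350, 10))        # plain average of the two numbers
circular   <- circ_mean_deg(c(350, 10))  # points north, up to rounding error
```

The arithmetic mean lands on 180, while the circular result sits at north, up to floating-point rounding.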
The 'circular' package will allow us to compute the mean wind direction without having to perform any trigonometry, but I am having trouble trying to nest this additional function within the 'fun.aggregate' argument. I thought a simple else if statement would do the trick, but I am running into the following error:
Error in vaggregate(.value = value, .group = overall, .fun = fun.aggregate, : could not find function ".fun"
In addition: Warning messages:
1: In if (wind$Parameter.Name == "Wind Direction - Resultant") { :
the condition has length > 1 and only the first element will be used
2: In if (wind$Parameter.Name == "Wind Speed - Resultant") { :
the condition has length > 1 and only the first element will be used
3: In mean.default(wind$"Wind Speed - Resultant") :
argument is not numeric or logical: returning NA
The goal is to be able to use fun.aggregate = mean for Wind Speed, but mean(circular(Wind Direction, units = 'degrees')) for Wind Direction.
Here's the original data (>100MB): https://drive.google.com/open?id=0By6o_bZ8CGwuUUhGdk9ONTgtT0E
Here's a subset of the data (1st 100 rows): https://drive.google.com/open?id=0By6o_bZ8CGwucVZGT0pBQlFzT2M
Here's my script:
library(reshape2)
library(dplyr)
library(circular)
#read in the long format data:
wind <- read.csv("<INSERT_FILE_PATH_HERE>", header = TRUE)
#cast into wide format:
wind.w <- dcast(wind,
                Local.Site.Name + Date.Local ~ Parameter.Name,
                value.var = "Arithmetic.Mean",
                fun.aggregate = (
                  if (wind$Parameter.Name == "Wind Direction - Resultant") {
                    mean(circular(wind$"Wind Direction - Resultant", units = 'degrees'))
                  }
                  else if (wind$Parameter.Name == "Wind Speed - Resultant") {
                    mean(wind$"Wind Speed - Resultant")
                  }),
                na.rm = TRUE)
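A likely reading of the error: fun.aggregate has to be a function that takes a vector of values, but the parenthesised if/else is evaluated on the spot and hands dcast a value instead. The condition is also a whole vector (one entry per row), which is where the "length > 1" warnings come from. A small sketch of the distinction, with made-up names:

```r
# The parenthesised expression is evaluated immediately, so what is stored
# is the resulting number, not something dcast could call later.
not_a_function <- (
  if (TRUE) {
    mean(c(1, 2, 3))
  }
)
is.function(not_a_function)   # FALSE

# What fun.aggregate expects: a function that receives the values of one
# group (one cell of the wide table) and returns a single value.
aggregate_fun <- function(x) mean(x, na.rm = TRUE)
is.function(aggregate_fun)    # TRUE
aggregate_fun(c(1, NA, 3))    # 2
```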
Any help would be greatly appreciated!
-spacedSparking
EDIT: HERE'S THE SOLUTION:
library(reshape2)
library(plyr)      # provides the .() helper used in the subset arguments below
library(SDMTools)  # provides circular.averaging()
library(dplyr)

#read in the EPA wind data:
#This data is publicly accessible, and can be found here: https://aqsdr1.epa.gov/aqsweb/aqstmp/airdata/download_files.html
wind <- read.csv("daily_WIND_2016.csv", sep = ',', header = TRUE, stringsAsFactors = FALSE)

#convert long format wind speed data by date and site id:
wind_speed <- dcast(wind,
                    Local.Site.Name + Date.Local ~ Parameter.Name,
                    value.var = "Arithmetic.Mean",
                    fun.aggregate = function(x) {
                      mean(x, na.rm = TRUE)
                    },
                    subset = .(Parameter.Name == "Wind Speed - Resultant")
)

#convert long format wind direction data into wide format by date and local site id:
wind_direction <- dcast(wind,
                        Local.Site.Name + Date.Local ~ Parameter.Name,
                        value.var = "Arithmetic.Mean",
                        fun.aggregate = function(x) {
                          #dcast also calls this function on an empty vector to get
                          #a fill value, so guard against length 0 (-1 marks "no data")
                          if (length(x) > 0)
                            circular.averaging(x, deg = TRUE)
                          else
                            -1
                        },
                        subset = .(Parameter.Name == "Wind Direction - Resultant")
)

#join the wide format split wind_speed and wind_direction dataframes
#(merge matches on the shared Local.Site.Name and Date.Local columns)
wind.w <- merge(wind_speed, wind_direction)
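For anyone who wants to check the split-and-merge idea without downloading the EPA file, here is a small sketch on made-up data. The toy data frame and the circ_mean_deg helper are assumptions for illustration only; the helper stands in for circular.averaging, and the rows are filtered with base R before casting instead of using the subset argument.

```r
library(reshape2)

# Hypothetical stand-in for circular.averaging(): circular mean in degrees.
# Returns NA for an empty vector, since dcast probes the function with one.
circ_mean_deg <- function(deg) {
  if (length(deg) == 0) return(NA_real_)
  rad <- deg * pi / 180
  (atan2(mean(sin(rad)), mean(cos(rad))) * 180 / pi + 360) %% 360
}

# Made-up rows using the same column names as the EPA data.
toy <- data.frame(
  Local.Site.Name = "Site A",
  Date.Local      = "2016-01-01",
  Parameter.Name  = c("Wind Speed - Resultant", "Wind Speed - Resultant",
                      "Wind Direction - Resultant", "Wind Direction - Resultant"),
  Arithmetic.Mean = c(2, 4, 350, 10),
  stringsAsFactors = FALSE
)

# Cast each parameter separately, each with its own aggregation function.
speed <- dcast(toy[toy$Parameter.Name == "Wind Speed - Resultant", ],
               Local.Site.Name + Date.Local ~ Parameter.Name,
               value.var = "Arithmetic.Mean", fun.aggregate = mean)
direction <- dcast(toy[toy$Parameter.Name == "Wind Direction - Resultant", ],
                   Local.Site.Name + Date.Local ~ Parameter.Name,
                   value.var = "Arithmetic.Mean", fun.aggregate = circ_mean_deg)

# Join on the shared site and date columns.
toy.w <- merge(speed, direction)
```

In this toy case the merged row should hold a mean speed of 3 and a direction at north (0, up to rounding), rather than the 180 an arithmetic mean would give.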