I have a data frame called 'madrid' that contains air pollutant measurements along with the date, year, month, and day. The data run from 2001 to 2018:
'data.frame': 3808224 obs. of 15 variables:
$ date : Date, format: "2001-08-01" "2001-08-01" "2001-08-01" ...
$ BEN : num NA 1.5 NA NA NA ...
$ CO : num 0.37 0.34 0.28 0.47 0.39 ...
$ EBE : num NA 1.49 NA NA NA ...
$ NMHC : num NA 0.07 NA NA NA ...
$ NO_2 : num 58.4 56.2 50.7 69.8 22.8 ...
$ O_3 : num 34.5 42.2 46.3 40.7 66.3 ...
$ PM10 : num 105 100.6 100.1 69.8 75.2 ...
$ SO_2 : num 6.34 8.11 7.85 6.46 8.8 ...
$ TCH : num NA 1.24 NA NA NA ...
$ TOL : num NA 10.8 NA NA NA ...
$ station: int 28079001 28079035 28079003 28079004 28079039 28079006 28079007 28079009 28079038 28079011 ...
$ year : num 2001 2001 2001 2001 2001 ...
$ month : num 8 8 8 8 8 8 8 8 8 8 ...
$ day : int 1 1 1 1 1 1 1 1 1 1 ...
I'm now trying to create a plot that shows the total of the monthly averages of the pollutants, ideally as one line/curve with the months on the x-axis, to see whether there's seasonality. So far, I've done this with dplyr:
madrid_season <- madrid %>%
  group_by(month, year) %>%
  summarise(BEN_mean  = mean(BEN, na.rm = TRUE),
            CO_mean   = mean(CO, na.rm = TRUE),
            EBE_mean  = mean(EBE, na.rm = TRUE),
            NMHC_mean = mean(NMHC, na.rm = TRUE),
            NO_2_mean = mean(NO_2, na.rm = TRUE),
            O_3_mean  = mean(O_3, na.rm = TRUE),
            PM10_mean = mean(PM10, na.rm = TRUE),
            SO_2_mean = mean(SO_2, na.rm = TRUE),
            TCH_mean  = mean(TCH, na.rm = TRUE),
            TOL_mean  = mean(TOL, na.rm = TRUE))
and then I sum the averages (excluding the month and year columns):
# columns 1 and 2 are month and year; columns 3 to 12 hold the ten pollutant means
madrid_season$tot <- rowSums(madrid_season[, 3:12], na.rm = TRUE)
and then I try to plot:
p4 <- ggplot(madrid_season, aes(x = month, y = tot)) + geom_point() + geom_line()
except this puts every year's total at the same month position and joins them all with a single line, instead of giving one value per month, so I end up with a wacky-looking plot.
I think I need help using dplyr in this case (or I'm open to other suggestions!).
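For illustration, one way to get a single seasonal curve might be to group by month only, so that all years are pooled into one value per calendar month. This is only a sketch under assumptions: the `toy` data frame below is a made-up stand-in for 'madrid' with two pollutants, and `across()` assumes dplyr 1.0 or later.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for 'madrid': 3 years, 2 readings per month, 2 pollutants
set.seed(1)
toy <- data.frame(
  year  = rep(2001:2003, each = 24),
  month = rep(rep(1:12, each = 2), times = 3),
  CO    = runif(72),
  O_3   = runif(72, 20, 80)
)

# One row per calendar month: each pollutant's mean pooled over all years,
# then the sum of those means
toy_season <- toy %>%
  group_by(month) %>%
  summarise(across(c(CO, O_3), ~ mean(.x, na.rm = TRUE)), .groups = "drop") %>%
  mutate(tot = rowSums(cbind(CO, O_3), na.rm = TRUE))

# Twelve points, so geom_line() draws a single curve
p <- ggplot(toy_season, aes(x = month, y = tot)) +
  geom_point() +
  geom_line() +
  scale_x_continuous(breaks = 1:12)
```

If one curve per year is wanted instead, keeping `year` in `group_by()` and adding `group = year` (or `colour = factor(year)`) to `aes()` should stop the years from being joined into one zigzag line.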
Date class variable on the x-axis, like the first day of each month. Add in a column that's as.Date(paste(year, month, "01", sep = "-")) and put that on the x-axis. – Gregor Thomas
str(), head() or screenshot)? You can use the reprex and datapasta packages to assist you with that. See also Help me Help you & How to make a great R reproducible example? – Tung
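The Date-column suggestion in the first comment could be sketched roughly as follows. The `toy` data and the column name `month_start` are hypothetical and only stand in for the real 'madrid' data; the sketch also uses a single pollutant to stay short.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for 'madrid': 2 readings per month over 3 years
set.seed(42)
toy <- expand.grid(day = 1:2, month = 1:12, year = 2001:2003)
toy$CO <- runif(nrow(toy))

# One row per (year, month), plus a Date set to the first day of that month
toy_monthly <- toy %>%
  group_by(year, month) %>%
  summarise(CO_mean = mean(CO, na.rm = TRUE), .groups = "drop") %>%
  mutate(month_start = as.Date(paste(year, month, "01", sep = "-")))

# A Date x-axis gives each (year, month) its own position along one time line
p <- ggplot(toy_monthly, aes(x = month_start, y = CO_mean)) +
  geom_point() +
  geom_line()
```

With a Date on the x-axis the series reads as one continuous time line, so any seasonality would show up as a pattern that repeats every twelve months rather than as overlapping points.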