Mannat, here is an answer using the data.table package to help you aggregate.
First install it with install.packages("data.table"), then load it:
library(data.table)
# For others:
# I copied your data into a csv file. Mannat, you will not need this step;
# other helpers, see the data in the DATA section below.
# (mypath is assumed to hold the folder containing the csv.)
final_data <- as.data.table(read.csv(file.path(mypath, "SOaccidents.csv"),
                                     header = TRUE,
                                     stringsAsFactors = FALSE))
# For Mannat:
# you will need to convert your existing data.frame to a data.table
final_data <- as.data.table(final_data)
# Check the column types: the dates are stored as strings,
# and the field is named Date, not DATE
str(final_data)
final_data$Date <- as.Date(final_data$Date, "%m/%d/%Y")
# Use data.table to aggregate by month.
# First, let's add a PlotDate field holding year and month as YYYYMM, e.g. 201401
final_data[, PlotDate := as.numeric(format(Date, "%Y%m"))]
# Key by this plot date
setkeyv(final_data, "PlotDate")
# Second, aggregate with by = and label the result columns
plotdata <- final_data[, .(Cyclists.monthly = sum(Cyclists.injured),
                           Motorists.monthly = sum(Motorists.injured)),
                       by = PlotDate]
#    PlotDate Cyclists.monthly Motorists.monthly
# 1:   201401                2                 8
# You can then plot this (makes more sense with more data)
# for example, for cyclists
plot(plotdata$PlotDate, plotdata$Cyclists.monthly)
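One thing to keep in mind with a numeric YYYYMM key is that the x axis treats 201412 and 201501 as 89 units apart, so multi-year plots get gaps. A possible variation, sketched here on the sample rows from the DATA section, is to aggregate on a real Date set to the first day of each month. The names MonthStart and plotdata_by_date are made up for this illustration and are not part of the original code.

```r
library(data.table)

# Sample rows copied from the DATA section of this answer
final_data <- data.table(
  Date = c("01/01/2014", "01/01/2014", "01/01/2014",
           "01/01/2014", "1/19/2014", "1/19/2014"),
  Cyclists.injured  = c(0L, 1L, 0L, 1L, 0L, 0L),
  Motorists.injured = c(1L, 2L, 0L, 2L, 1L, 2L)
)

# Convert the date strings to Date objects (four-digit year, so %Y)
final_data[, Date := as.Date(Date, format = "%m/%d/%Y")]

# MonthStart (hypothetical name): first day of the month each row falls in
final_data[, MonthStart := as.Date(format(Date, "%Y-%m-01"))]

# Same aggregation as above, grouped by the Date-typed month instead
plotdata_by_date <- final_data[, .(Cyclists.monthly = sum(Cyclists.injured),
                                   Motorists.monthly = sum(Motorists.injured)),
                               by = MonthStart]

# Base plot understands Date on the x axis and spaces the months evenly
plot(plotdata_by_date$MonthStart, plotdata_by_date$Cyclists.monthly,
     type = "b", xlab = "month", ylab = "cyclists injured")
```

With only January in the sample there is a single point; the benefit shows up once the full data spans several months or years.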
Mannat, if you are not familiar with data.table, please see the cheatsheet.
DATA
For others looking to work on this, here is the result from dput:
final_data <- data.table(Date = c("01/01/2014", "01/01/2014", "01/01/2014",
                                  "01/01/2014", "1/19/2014", "1/19/2014"),
                         Time = c("12:05", "12:34", "06:05", "08:01", "12:05", "12:56"),
                         Location = c("Bronx", "Bronx", "Bronx", "Bronx",
                                      "Manhattan", "Manhattan"),
                         Cyclists.injured = c(0L, 1L, 0L, 1L, 0L, 0L),
                         Motorists.injured = c(1L, 2L, 0L, 2L, 1L, 2L))
PLOTS
Either use the ggplot2 package, or see "Plot multiple lines (data series) each with unique color in R" for plotting help.
# I do not have your full data, and a line chart with a single point does not work,
# so I added a fake February to have a second month for testing
testfeb <- data.table(PlotDate = 201402, Cyclists.monthly = 4,
                      Motorists.monthly = 10)
plotdata <- rbindlist(list(plotdata, testfeb))
#    PlotDate Cyclists.monthly Motorists.monthly
# 1:   201401                2                 8
# 2:   201402                4                10
# Plot code, modify the limits as you see fit
# Start with an empty plot, then draw one line per series
plot(1, type = "n",
     xlim = c(201401, 201412),
     ylim = c(0, max(plotdata$Motorists.monthly)),
     ylab = "monthly accidents",
     xlab = "months")
lines(plotdata$PlotDate, plotdata$Motorists.monthly, col = "blue")
lines(plotdata$PlotDate, plotdata$Cyclists.monthly, col = "red")
# To add a legend (one lty/lwd/col value per legend entry)
legend(x = "topright", legend = c("Motorists", "Cyclists"),
       lty = c(1, 1), lwd = c(2.5, 2.5),
       col = c("blue", "red"))
# or set the legend's x to another position, e.g. "bottom" or "bottomleft"
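For the ggplot2 route mentioned above, a minimal sketch might look like the following. It assumes ggplot2 is installed and rebuilds the two-row monthly table (including the fake February) so that it runs on its own. The names long_data, Series and Injured are made up for illustration.

```r
library(data.table)
library(ggplot2)

# Monthly totals from this answer, including the fake February used for testing
plotdata <- data.table(PlotDate = c(201401, 201402),
                       Cyclists.monthly = c(2, 4),
                       Motorists.monthly = c(8, 10))

# Reshape to long format: one row per month and series
# (long_data, Series and Injured are illustrative names)
long_data <- melt(plotdata, id.vars = "PlotDate",
                  variable.name = "Series", value.name = "Injured")

# Treat PlotDate as a category so the months are evenly spaced on the x axis
p <- ggplot(long_data, aes(x = factor(PlotDate), y = Injured,
                           colour = Series, group = Series)) +
  geom_line() +
  geom_point() +
  scale_colour_manual(values = c(Cyclists.monthly = "red",
                                 Motorists.monthly = "blue")) +
  labs(x = "months", y = "monthly accidents")
print(p)
```

The long format lets ggplot2 build the legend from the Series column, so no separate legend() call is needed.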
%Y instead of %y. – David Arenburg
as.Date("1/1/2014" , "%m/%d/%Y") works just fine. – David Arenburg
dput(head(final_data)) (3) The question asks for a pedestrian time series but there is no pedestrian data in your data frame. (4) Are you looking to sum each numeric column by Date and then plot the sums against Date, ignoring the Time and Location columns? – G. Grothendieck
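The format-string point from the comments can be checked directly in base R. This is only a small sketch using one of the sample dates; the variable names are made up for illustration.

```r
# %Y expects a four-digit year, %y a two-digit year
with_four_digit <- as.Date("1/19/2014", format = "%m/%d/%Y")
with_two_digit  <- as.Date("1/19/2014", format = "%m/%d/%y")

print(with_four_digit)  # 2014-01-19
print(with_two_digit)   # a different year: %y reads only two digits of "2014"

# The YYYYMM string that the answer turns into PlotDate
print(format(with_four_digit, "%Y%m"))
```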