I am trying to do some analysis on a dataset (homicide rates in Brazil). The data is simple, but I am still learning, so it is not so simple for me. After creating subsets grouping the data by year, state and region, I still can't work out how to combine these subsets into a bigger one (states by region). I would like to put all the regions into one bigger 'subset' so that the plot shows the data by region instead of by state. It's probably simple, but I have spent a couple of hours googling and trying different code, and nothing has worked so far.
North <- subset(Homicides, State == 'AM' | State == 'RR' | State == 'AP' | State == 'PA' | State == 'TO' | State == 'RO' | State == 'AC')
Northeast <- subset(Homicides, State == 'MA' | State == 'PI' | State == 'CE' | State == 'RN' | State == 'PE' | State == 'PB' | State == 'SE' | State == 'AL' | State == 'BA')
Midwest <- subset(Homicides, State == 'MT' | State == 'MS' | State == 'GO' | State == 'DF')
Southeast <- subset(Homicides, State == 'SP' | State == 'RJ' | State == 'ES' | State == 'MG')
South <- subset(Homicides, State == 'PR' | State == 'RS' | State == 'SC')
# AllRegions <- ???  # How to group them so I can plot correctly?
And for the plot code:
library(ggplot2)
library(ggthemes)  # provides theme_hc() and scale_colour_hc()

ggplot(Homicides, aes(x = Year, y = TotalRate, group = State, color = State)) +  # Where State should be the regions instead
  geom_line() +
  geom_point(size = 1) +
  ggtitle("Total Homicides") +
  theme_hc() +
  scale_colour_hc()
What the dataset looks like (for reference):
  State Year TotalRate FirearmsRate
1    AC 1979        34           13
2    AC 1980        26           12
3    AC 1981        28            8
4    AC 1982        41           18
5    AC 1983        33           12
6    AC 1984        36           13
Please share your data using dput(Homicides) or, if the output is really long, dput(head(Homicides)). I think you can solve this by creating a new column with some combination of mutate and case_when. – Ben G
ggplot2::facet_wrap might help here. If instead you are only interested in plotting the regions in one graph, I think you should first summarise the data by region; then you can plot some summary statistic by region. – Giovanni Colitti
> dput(head(Homicides))
structure(list(State = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = c("AC", "AL", "AM", "AP", "BA", "CE", "DF", "ES", "GO", "MA", "MG", "MS", "MT", "PA", "PB", "PE", "PI", "PR", "RJ", "RN", "RO", "RR", "RS", "SC", "SE", "SP", "TO"), class = "factor"), Year = 1979:1984, TotalRate = c(34L, 26L, 28L, 41L, 33L, 36L), FirearmsRate = c(13L, 12L, 8L, 18L, 12L, 13L)), row.names = c(NA, 6L), class = "data.frame") – Xamineh
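To make the two suggestions above concrete, here is a minimal sketch of one way to add a Region column and then summarise by region. It uses base R (a named lookup vector plus aggregate()) as a stand-in for the mutate/case_when approach mentioned in the comments. The toy rows, the names regions, state_to_region and RegionRates, and the choice of an unweighted mean as the summary statistic are all assumptions for illustration, not something taken from the original data.

```r
# Toy data in the same shape as Homicides (values are made up for illustration).
Homicides <- data.frame(
  State     = c("AC", "AC", "SP", "SP", "RJ", "RJ"),
  Year      = c(1979L, 1980L, 1979L, 1980L, 1979L, 1980L),
  TotalRate = c(34, 26, 10, 12, 20, 22)
)

# Region membership, using the same state groupings as the subsets in the question.
regions <- list(
  North     = c("AM", "RR", "AP", "PA", "TO", "RO", "AC"),
  Northeast = c("MA", "PI", "CE", "RN", "PE", "PB", "SE", "AL", "BA"),
  Midwest   = c("MT", "MS", "GO", "DF"),
  Southeast = c("SP", "RJ", "ES", "MG"),
  South     = c("PR", "RS", "SC")
)

# Named lookup vector: names are state codes, values are region names.
state_to_region <- setNames(
  rep(names(regions), lengths(regions)),
  unlist(regions, use.names = FALSE)
)

# as.character() matters because State is a factor in the real data;
# indexing by a factor would use its integer codes instead of its labels.
Homicides$Region <- unname(state_to_region[as.character(Homicides$State)])

# One row per Region and Year. The unweighted mean of the state rates is an
# assumed summary statistic; a population-weighted rate may be more appropriate.
RegionRates <- aggregate(TotalRate ~ Region + Year, data = Homicides, FUN = mean)
```

With a table like RegionRates, the plot from the question should only need Region in place of State, for example aes(x = Year, y = TotalRate, group = Region, color = Region). Keeping every state in one data frame with a Region column also avoids having to stitch the five subsets back together.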