
I am using ggplot2 to make a bar chart of the number of participants per year by gender. With 14 years included, I would like 2 bars for each year, corresponding to the number of males and females for that year. I am not getting each year along the x-axis, and I think the data is being binned. I have tried changing the bin width and using scale_x_date, but I am still stuck. Can you help me figure out how to show the data for EACH year in my graph?

As an example, here is my data for years 2004-2017:

year=c(2004,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014,2015,2016,2017)
gender=c("male" , "female")

Participants is listed by gender, male then female, for each year in turn:

Participants=c(1307,443,1847,630,2109,765, 1824,691,2250,952,3123,1421,4097,1904,6415,3284,8788,4678,11581,6694,13141,8478,16389,10575,20990,13811,26951,19729)
data=data.frame(year,gender,Participants)

Here is how I am trying to generate my plot:

MyPlot <- ggplot(data, aes(fill=gender, y=Participants, x=year)) + 
                 geom_bar(position="dodge", stat="identity",width = .8)

print(MyPlot + ggtitle("Annual Number of Participants by Gender"))

On the x-axis, only the years 2006, 2010, 2014 and 2018 are marked, and the bars correspond to data from two years. I want data for each year, both in terms of the bars and in terms of the ticks on the x-axis.

Any help would be appreciated!


1 Answer


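Your Participants vector has 28 values, but year has only 14 and gender only 2. data.frame() recycles the shorter vectors to fill 28 rows, so the rows no longer pair each count with the right year and gender. That means you don't have a clear data frame design to serve as an input to ggplot.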
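The mismatch can be checked directly. The following is a minimal sketch in base R that rebuilds the data frame exactly as in the question and inspects how the rows were paired; nothing here is new apart from the calls to head() and table().

```r
# Vectors exactly as defined in the question
year <- c(2004, 2005, 2006, 2007, 2008, 2009, 2010,
          2011, 2012, 2013, 2014, 2015, 2016, 2017)
gender <- c("male", "female")
Participants <- c(1307, 443, 1847, 630, 2109, 765, 1824, 691, 2250, 952,
                  3123, 1421, 4097, 1904, 6415, 3284, 8788, 4678, 11581, 6694,
                  13141, 8478, 16389, 10575, 20990, 13811, 26951, 19729)

# year (length 14) is recycled twice and gender (length 2) fourteen times
data <- data.frame(year, gender, Participants)

# The first rows already show the problem: 443 is the 2004 female count,
# but it has been assigned to 2005
head(data, 4)

# Cross-tabulate year against gender: every year has rows for one gender only
table(data$year, data$gender)
```

In the cross-tabulation, even years contain two "male" rows and odd years contain two "female" rows, which is why the bars in the plot appear to cover two years each.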

Start by reading this: https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html

The key points are:

Each variable forms a column.
Each observation forms a row.
Each type of observational unit forms a table.

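Once you have a tidy tibble/data frame, with one row per year and gender, your ggplot2 code should draw two bars for each year. A numeric x-axis still labels only a few breaks, so to get a tick for every year you will probably also want to treat year as discrete, for example with x=factor(year). I'd drop the width= option until you have it working.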
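As a rough sketch of what that could look like, assuming the Participants values really are ordered male then female within each year as the question states: rep() is used to spell out one row per year and gender, and year is converted to a factor for the x-axis. The use of rep(), factor(year), geom_col() and labs() is my suggestion rather than something taken from the question.

```r
library(ggplot2)

Participants <- c(1307, 443, 1847, 630, 2109, 765, 1824, 691, 2250, 952,
                  3123, 1421, 4097, 1904, 6415, 3284, 8788, 4678, 11581, 6694,
                  13141, 8478, 16389, 10575, 20990, 13811, 26951, 19729)

# One row per observation: each year appears twice, once per gender
data <- data.frame(
  year         = rep(2004:2017, each = 2),
  gender       = rep(c("male", "female"), times = 14),
  Participants = Participants
)

# factor(year) makes the x-axis discrete, so every year gets its own tick;
# geom_col() is equivalent to geom_bar(stat = "identity")
MyPlot <- ggplot(data, aes(x = factor(year), y = Participants, fill = gender)) +
  geom_col(position = "dodge") +
  labs(x = "year", title = "Annual Number of Participants by Gender")

print(MyPlot)
```

If you would rather keep year numeric, adding scale_x_continuous(breaks = 2004:2017) to the original aes(x = year) mapping should also give one tick per year.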