I want to show the change in job numbers over a certain time period. Ideally, I'd like to use a ggplot2 geom_dotplot and color the dots by the column (industry) they come from for each month. One idea I have not tried yet: do I need to reshape my data from wide to long format with tidyr in order to plot this?

Example data

Month       Finance       Tech        Construction     Manufacturing
Jan         14,000        6,800       11,000           17,500
Feb         11,500        8,400       9,480            15,000
Mar         15,250        4,200       7,200            12,400
Apr         12,000        6,400       10,300           8,500

My current R code attempt is below. I know that I need to fill the dot color by a factor of industry type, and maybe the data has to be in long format to do so.

library(tidyverse)
g <- ggplot(dat, aes(x = Month)) +
  geom_dotplot(stackgroups = TRUE, binwidth = 1000, binpositions = "all") +
  theme_light()
g

Here's how the plot I'm trying to make could look. Ideally I'd like to bin the dots as one dot per 1000 in the column value. Is that possible?

Thank you for taking the time to help someone who is new to R and is studying in school. Much appreciated as always,

1 Answer

I could not get geom_dotplot to work; the y-axis always comes out wrong. Try something like this instead: first pivot to long format, then repeat each Month/category row once per 1000 in its value. Note that the solution below rounds up:

library(dplyr)
library(tidyr)
library(ggplot2)

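# Pivot to long format, then repeat each Month/category row once per 1000 (rounded up)
test = pivot_longer(dat, -Month, names_to = "category") %>%
  group_by(Month, category) %>%
  summarize(bins = ceiling(value / 1000), .groups = "drop") %>%
  uncount(bins)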

If you would prefer to round down to the nearest 1000, use floor() instead of ceiling().
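
To see how much the rounding choice matters, here is a small sketch on the example data from the question. It assumes the columns are plain numbers (that is, the thousands separators in the table are display formatting only); the names rounding, dots_ceiling, dots_floor and totals are made up for this illustration.

```r
library(dplyr)
library(tidyr)

# Example data from the question, entered as plain numbers (assumption:
# the commas in the table are display formatting, not part of the values).
dat <- data.frame(
  Month         = c("Jan", "Feb", "Mar", "Apr"),
  Finance       = c(14000, 11500, 15250, 12000),
  Tech          = c(6800, 8400, 4200, 6400),
  Construction  = c(11000, 9480, 7200, 10300),
  Manufacturing = c(17500, 15000, 12400, 8500)
)

# One row per Month/category, with the dot count under each rounding rule
rounding <- dat %>%
  pivot_longer(-Month, names_to = "category") %>%
  mutate(
    dots_ceiling = ceiling(value / 1000),
    dots_floor   = floor(value / 1000)
  )

# Total number of dots each rule would draw
totals <- rounding %>%
  summarise(ceiling = sum(dots_ceiling), floor = sum(dots_floor))
```

For example, Feb Finance (11,500) gets 12 dots with ceiling() and 11 with floor(), while values that are exact multiples of 1000, such as Jan Finance, get the same count either way.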

Then plot:

# Keep the months in their original order (dat$Month also works if dat is a tibble)
test$Month = factor(test$Month, levels = unique(dat$Month))

# Each dot stands for 1000 jobs; the axis labels are scaled back up by 1000
test %>% ggplot(aes(x = Month, y = 1, col = category)) +
  geom_point(position = position_stack()) +
  scale_y_continuous(labels = scales::number_format(scale = 1000))
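
If you would rather not rely on position_stack(), a possible alternative is to compute each dot's height yourself. This is only a sketch, not the approach above: the counts table is a hand-typed subset of the rounded-up dot counts (Jan and Feb, Finance and Tech), and the names counts, dots and p are made up for the example.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical subset of the rounded-up dot counts (one dot per 1000 jobs)
counts <- data.frame(
  Month    = c("Jan", "Jan", "Feb", "Feb"),
  category = c("Finance", "Tech", "Finance", "Tech"),
  bins     = c(14, 7, 12, 9)
)

dots <- counts %>%
  uncount(bins) %>%                                      # one row per dot
  mutate(Month = factor(Month, levels = unique(Month))) %>%
  group_by(Month) %>%
  mutate(y = row_number() * 1000) %>%                    # explicit stack height
  ungroup()

# y is already in job units, so no label rescaling is needed
p <- ggplot(dots, aes(x = Month, y = y, colour = category)) +
  geom_point() +
  scale_y_continuous(labels = scales::comma) +
  theme_light()
```

Printing p draws the chart. Because y is stored in the data, it is easy to check where each dot lands before plotting.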
