
I created a bar graph in ggplot using stat = "count" and position = "fill" to show the proportional occurrence of each feature per year (below). I find the readability of this graph rather poor and therefore I'd like to split the graph into facets. However, if I add facet_wrap(~Features), it just fills the bars in every separate facet. How can I prevent this from happening?

The code for my original graph is:

data %>% 
  ggplot(aes(x = Year, fill = Features)) + 
  geom_bar(stat = "count", position = "fill") + 
  theme_classic() + 
  theme(axis.text.x = element_text(angle = 90)) + 
  scale_y_continuous(labels = scales::percent)

I've tried:

data %>% 
  ggplot(aes(x = Year)) + 
  stat_count(geom = "bar", aes(y = ..prop..)) + 
  facet_wrap(~Features) + 
  theme_classic() + 
  theme(axis.text.x = element_text(angle = 90))

but this calculates the proportion within the facet rather than within each year.

Any ideas how I can solve this (using ggplot, rather than by restructuring my data)?

[Image: stacked bar chart of the proportion of each feature per year]

A little about my data:

I have a data frame with one row per observation: the feature (factor) and the year (factor) in which it was observed. The same feature can occur several times per year, so there are several rows with the same year and feature.

And while I'm about it, I would not want to show the facet for "other", although it should still be used in calculating the proportions. - Remco de Grave
Can you tell us a bit more about what you want the ultimate solution to look like? If you want to split the plot into two groups, you need to make a variable in your data frame that identifies the two groups, then you could facet on them. If that causes problems with the proportion calculation, you could always calculate the proportions ahead of time, provide them in the y aesthetic and use stat = "identity". - DaveArmstrong
Thanks Dave. I want a facet for each feature, where each bar is exactly as high as it is in the graph above. In other words, a facet for each colour in the graph above, with the other colours hidden and the bars starting at y = 0 rather than floating in the middle of the facet. - Remco de Grave

1 Answer


This should work. First, I'll make some data that has similar properties:

library(dplyr)    # %>%, mutate(), filter()
library(tibble)   # tibble()
library(ggplot2)

# Named feature_labs so the vector doesn't share a name with ggplot2::labs()
feature_labs <- c("Digital labels", "Product ID (barcode)", 
                  "Smart labels", "Product Recommendation", 
                  "Shopping list", "Product Browsing", 
                  "Product ID (computer vision)", 
                  "Navigation (in-store)", "Product ID (RFID)", 
                  "Other")

# years[[i]] holds the indices of the features observed in the i-th year
years <- vector(mode="list", length=13)
years[[1]] <- c(1,2)
years[[2]] <- c(1,2,8)
years[[3]] <- c(1,2,4,10)
years[[4]] <- c(1,2,3,4,5,6,8,9,10)
years[[5]] <- c(2,3,4,5,8,10)
years[[6]] <- c(1:6, 10)
years[[7]] <- c(1:6, 10)
years[[8]] <- 1:10
years[[9]] <- c(1,3,6,9,10)
years[[10]] <- c(1:5, 7,9,10)
years[[11]] <- 1:10
years[[12]] <- c(1:6, 8:10)
years[[13]] <-  c(1,2,3,6,8,9,10)
y <- 2008:2020

set.seed(1)  # arbitrary seed, only so the simulated data are reproducible
dat <- NULL
for(i in 1:13){
  # draw between 600 and 1000 observations for this year
  tmp <- tibble(
    Features = sample(years[[i]], runif(1,600,1000), replace=TRUE), 
    Year = y[i]
  ) %>% 
    mutate(Features = factor(Features, levels=1:10, labels=feature_labs))
  dat <- rbind(dat, tmp)
}

Next, here's a plot like the one you made initially.

dat %>% 
  ggplot(aes(x = Year, fill = Features)) + 
  geom_bar(stat = "count", position = "fill") + 
  theme_classic() + 
  theme(axis.text.x = element_text(angle = 90)) + 
  scale_y_continuous(labels = scales::percent)

[Image: stacked bar chart of the simulated data, one bar per year filled by feature]

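And here's how that would translate into different facets. The key is to compute each feature's share of its year by hand first (the agdat data frame, with the share in a pct column) and then plot those values directly. Because "Other" is only filtered out at the plotting stage, it still counts toward each year's total.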
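The step that builds agdat isn't shown in the snippet, so here is a minimal sketch of one way it could be done, assuming a data frame with Year and Features columns like dat above. The helper name year_shares is made up for illustration, and the column names n and pct are assumptions chosen to match the plotting code that follows.

```r
library(dplyr)

# Hypothetical helper: count the rows for each Year/Features pair, then
# divide each count by that year's total. The resulting shares are the
# bar heights that position = "fill" would draw.
year_shares <- function(df) {
  df %>%
    count(Year, Features, name = "n") %>%  # rows per Year/Features pair
    group_by(Year) %>%
    mutate(pct = n / sum(n)) %>%           # share within each year
    ungroup()
}
```

With the simulated data, agdat <- year_shares(dat) gives one row per year and feature, and the pct values within each year sum to 1.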

agdat %>% 
  filter(Features != "Other") %>% 
  ggplot(aes(x=Year, y=pct)) + 
  geom_bar(stat="identity") + 
  facet_wrap(~Features, ncol=3) + 
  labs(x="Year", y="Percent") + 
  theme_classic() + 
  theme(axis.text.x = element_text(angle = 90)) + 
  scale_y_continuous(labels = scales::percent)

[Image: faceted bar chart, one panel per feature except "Other", showing each feature's share per year]
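On the comment about hiding "Other" while keeping it in the calculation: the order of the filter matters. The toy example below (made-up data, one year, four observations) is only a sketch of the difference between dropping "Other" after the shares are computed and dropping it before.

```r
library(dplyr)

# Hypothetical toy data: a single year with four observations
toy <- data.frame(
  Year = 2020,
  Features = c("A", "A", "B", "Other")
)

# Shares computed on all rows, "Other" dropped afterwards:
# "Other" still counts toward the yearly total
drop_after <- toy %>%
  count(Year, Features) %>%
  group_by(Year) %>%
  mutate(pct = n / sum(n)) %>%
  ungroup() %>%
  filter(Features != "Other")

# "Other" dropped first: the remaining shares are rescaled to sum to 1
drop_first <- toy %>%
  filter(Features != "Other") %>%
  count(Year, Features) %>%
  group_by(Year) %>%
  mutate(pct = n / sum(n)) %>%
  ungroup()

drop_after$pct  # shares of A and B: 0.50 and 0.25
drop_first$pct  # shares of A and B: 2/3 and 1/3
```

Only the first version reproduces the bar heights of the stacked plot, which is why the filter sits in the plotting pipeline rather than in the aggregation.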