
I have a data frame with four columns. One of them, status, holds a binary value: 0 or 1.

After grouping the data by hour, I want stacked bar plots showing the percentage of rows with 0 and 1 in the status column.

On SO I found the following related questions:

ggplot replace count with percentage in geom_bar

Show % instead of counts in charts of categorical variables

Create stacked barplot where each stack is scaled to sum to 100%

Creating a Stacked Percentage Bar Chart in R with ggplot

R stacked percentage bar plot with percentage of binary factor and labels (with ggplot)

and came up with this solution:

ggplot(df4, aes(x=hour, y=status, fill=as.factor(status)) ) +
  geom_bar(stat="identity") + 
  facet_grid(status ~ .) + 
  scale_x_continuous(breaks=seq(0,25,1))

However, the resulting plot does not show any bars for status values of 0, and the y axis is not in percent.


Why are the 0s not plotted, and how can I fix this?

The dataframe as csv: https://pastebin.com/Y7CfwPbf

Actually, the first linked question answers my problem, but I wonder whether it is possible to achieve this without an intermediate step that creates a new data frame.


2 Answers

0
votes

Is this what you are looking for?


See the article "How to plot a 'percentage plot' with ggplot2".

The code:

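library(data.table)  # provides fread()
library(ggplot2)

# Read the CSV directly from the raw pastebin URL
df4 <- fread("https://pastebin.com/raw/Y7CfwPbf")

# after_stat(prop) is the current spelling of the older ..prop.. notation
# (requires ggplot2 >= 3.3.0); prop is computed within each fill group,
# i.e. within each status
ggplot(df4, aes(x = hour, y = 100 * after_stat(prop), fill = factor(status))) +
  geom_bar() +
  facet_grid(status ~ .) +
  scale_x_continuous(breaks = seq(0, 25, 1))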
0
votes

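The perc column can be computed on the fly inside the ggplot() call, without storing an intermediate data frame:

library(dplyr)
library(ggplot2)

# Count rows per status/hour, then convert counts to a percentage within each status
ggplot(df4 %>% group_by(status, hour) %>%
         summarise(n = n()) %>%
         mutate(perc = round(n / sum(n), 3) * 100),
       aes(x = hour, y = perc, fill = as.factor(perc))) +
  geom_bar(stat = "identity") +
  facet_grid(status ~ .) +
  scale_x_continuous(breaks = seq(0, 25, 1))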


If you want the bars for a given hour to have the same color in both facets, map fill to hour instead:

# Same summary as above, but fill is mapped to hour rather than perc
ggplot(df4 %>% group_by(status, hour) %>%
         summarise(n = n()) %>%
         mutate(perc = round(n / sum(n), 3) * 100),
       aes(x = hour, y = perc, fill = as.factor(hour))) +
  geom_bar(stat = "identity") +
  facet_grid(status ~ .) +
  scale_x_continuous(breaks = seq(0, 25, 1))

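On why the 0s vanish in the original attempt: with stat = "identity" and y = status, every row is drawn with a height equal to its status value, so rows with status 0 have zero height and nothing is visible. Below is a minimal sketch of one way to get a single stacked bar per hour scaled to 100% without building an intermediate data frame, using position = "fill". The df_demo data frame and its values are hypothetical stand-ins for df4, not the pastebin data.

```r
library(ggplot2)

# Toy stand-in for df4 (hypothetical values; the real data is in the pastebin CSV).
df_demo <- data.frame(
  hour   = c(0, 0, 0, 1, 1, 2, 2, 2, 2),
  status = c(0, 1, 1, 0, 0, 1, 0, 1, 1)
)

# Total bar height contributed by each status value when y = status is
# plotted with stat = "identity": rows with status 0 add nothing.
height_by_status <- tapply(df_demo$status, df_demo$status, sum)

# position = "fill" rescales each hour's stack to 1, so the counts never
# have to be summarised by hand; scales::percent labels the axis in percent.
p <- ggplot(df_demo, aes(x = factor(hour), fill = factor(status))) +
  geom_bar(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "hour", y = "share of rows", fill = "status")

print(p)
```

To try this on the real data, replace df_demo with df4. Note that this sketch drops the facets, since both statuses share one bar per hour.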