0
votes

I am using geom_histogram in R to produce a histogram using the code:

ggGender <- ggplot(dfGenderGrouped, aes(log(freq), fill=dfGenderGrouped$name) ) + 
geom_histogram(data=dfGenderGrouped, binwidth = 1, alpha=0.5, color="black") + theme_bw() + 
theme(axis.title = element_text(size=16), legend.text = element_text(size=12), axis.text.y = element_text(size=12, angle=45), axis.text.x = element_text(size=12), legend.position=c(0.8,0.7)) + ylab("Number of patients") + 
xlab("Events (log)")+labs(fill="Events") + scale_y_continuous(labels = comma) + 
scale_fill_brewer(palette="Spectral")


The dfGenderGrouped data frame looks like:

  patid freq              name Group
1  1156    1 Male - All events   All
2  1194    1 Male - All events   All
3  1299    1 Male - All events   All
4  1445    1 Male - All events   All
5  1476    2 Male - All events   All
6  2045    2 Male - All events   All

The unique values of name are shown in the legend. The unique values of Group are:

> unique(dfGenderGrouped$Group)
[1] "All"      "Clinical" "Referral" "Therapy"

I would like to organise the stacks by the Group value, e.g., in bin 0 you have a stacked column of Female - All events and Male - All events, and then the same stacked column in bin 1, etc. For further clarification, I would then like Female - Clinical events and Male - Clinical events as a single stacked column, also across the bins. Thus, each column of stacked values has the Group value in common (All, Clinical, Referral, and Therapy).

Further clarification, bin 0 would have the following column stacks (organised by Group in the data.frame):

Female - All events & Male - All events
Female - Clinical events & Male - Clinical events
Female - Referral events & Male - Referral events
Female - Therapy events & Male - Therapy events

Then for bin 1 the same:

Female - All events & Male - All events
Female - Clinical events & Male - Clinical events
Female - Referral events & Male - Referral events
Female - Therapy events & Male - Therapy events

Help is much appreciated.

1
The data may not be sufficient to make a plot; please make the example more reproducible. – NelsonGon
I think you want to both stack and dodge the same histogram, which isn't really possible, though there are hacks to achieve the effect: see stackoverflow.com/questions/12715635/… – Allan Cameron

1 Answer

2
votes

What about faceting your graph by the "Group" column, like this:

library(ggplot2)
ggplot(data = df, aes(log(Freq), fill = Name))+
    geom_histogram(binwidth = 1, alpha = 0.5,color = "black")+
    facet_wrap(.~Group,nrow = 1, scales = "fixed")+
    labs(x = "Events (log)", y = "Number of patients", fill="Events") + 
    scale_fill_brewer(palette="Spectral")


EDIT: Simplify the legend

To simplify the legend, you can plot only Male and Female within each facet. You just need to edit your "Name" column to strip the right-hand part of the string and keep only the Male / Female label:

# Drop everything from " - " onwards, e.g. "M - ALL" becomes "M"
df$Name <- sub(" - .*", "", df$Name)
ggplot(data = df, aes(log(Freq), fill = Name))+
  geom_histogram(binwidth = 1, alpha = 0.5,color = "black")+
  facet_wrap(.~Group,nrow = 1, scales = "fixed")+
  labs(x = "Events (log)", y = "Number of patients", fill="Events") + 
  scale_fill_brewer(palette="Spectral")


Alternative using grid.arrange

Alternatively, you can create 4 plots and arrange them in a single figure using the grid.arrange function from the gridExtra package. That way, you will have a separate legend for each plot:

library(gridExtra)
ALL <- ggplot(data = subset(df, Group == "ALL"), aes(log(Freq), fill = Name))+
  geom_histogram(binwidth = 1, alpha = 0.5,color = "black")+
  labs(x = "Events (log)", y = "Number of patients", fill="Events", title = "ALL") + 
  scale_fill_brewer(palette="Spectral")+
  scale_x_continuous(limits = c(4,9), breaks = 4:9)+
  theme_bw()+
  theme(legend.position=c(0.3,0.7),
        legend.text = element_text(size=8),
        legend.title = element_text(size = 8))

Clin <- ggplot(data = subset(df, Group == "Clin"), aes(log(Freq), fill = Name))+
  geom_histogram(binwidth = 1, alpha = 0.5,color = "black")+
  labs(x = "Events (log)", y = "Number of patients", fill="Events", title = "Clinical") + 
  scale_fill_brewer(palette="Spectral")+
  scale_x_continuous(limits = c(4,9), breaks = 4:9)+
  theme_bw()+
  theme(legend.position=c(0.3,0.7),
        legend.text = element_text(size=8),
        legend.title = element_text(size = 8))

Ref <- ggplot(data = subset(df, Group == "Ref"), aes(log(Freq), fill = Name))+
  geom_histogram(binwidth = 1, alpha = 0.5,color = "black")+
  labs(x = "Events (log)", y = "Number of patients", fill="Events", title = "Ref") + 
  scale_fill_brewer(palette="Spectral")+
  scale_x_continuous(limits = c(4,9), breaks = 4:9)+
  theme_bw()+
  theme(legend.position=c(0.3,0.7),
        legend.text = element_text(size=8),
        legend.title = element_text(size = 8))

Ther <- ggplot(data = subset(df, Group == "Ther"), aes(log(Freq), fill = Name))+
  geom_histogram(binwidth = 1, alpha = 0.5,color = "black")+
  labs(x = "Events (log)", y = "Number of patients", fill="Events", title = "Ther") + 
  scale_fill_brewer(palette="Spectral")+
  scale_x_continuous(limits = c(4,9), breaks = 4:9)+
  theme_bw()+
  theme(legend.position=c(0.3,0.7),
        legend.text = element_text(size=8),
        legend.title = element_text(size = 8))

grid.arrange(nrow = 1, ALL, Clin, Ref, Ther)


Does this look like what you are trying to achieve? If not, can you clarify your question?


NB: Please take a look at my code to see the usual way to build a ggplot2 graph. For example, once you have passed the data frame with data =, you no longer need $ to refer to column names inside aes().
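
To illustrate the NB above, here is a minimal sketch. The toy data frame is hypothetical and only stands in for dfGenderGrouped; the point is that columns are written as bare names inside aes() and looked up in the data frame given to data =. As far as I know, this also matters for faceting, since a vector passed with $ is not split per panel.

```r
library(ggplot2)

# Hypothetical toy data standing in for dfGenderGrouped (values are made up)
toy <- data.frame(
  freq = c(1, 1, 2, 5, 8, 13),
  name = rep(c("Male - All events", "Female - All events"), each = 3)
)

# Bare column names inside aes(): ggplot2 looks them up in `data`,
# so aes(fill = name) replaces aes(fill = toy$name)
p <- ggplot(data = toy, aes(x = log(freq), fill = name)) +
  geom_histogram(binwidth = 1, alpha = 0.5, color = "black")
```

Printing p draws the histogram; the data frame is named only once, in the ggplot() call.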


Reproducible example:

df <- data.frame(Group = rep(c("ALL","Clin","Ref","Ther"), each = 50),
                 Name = rep(rep(c("M","F"), each = 25), 4),
                 Freq = sample(1:10000, 200, replace = TRUE),
                 Patient = sample(1000:5000, 200, replace = TRUE))
df$Name <- paste(df$Name, df$Group, sep = " - ")
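
If the facets by Group are not quite what you want and you would rather see the four Group stacks side by side within each bin, one possible workaround is to bin the data yourself and facet by bin instead. This is only a sketch under assumptions: the LogBin column is a hypothetical helper, rounding log(Freq) to the nearest integer only approximates what geom_histogram(binwidth = 1) does, and set.seed is added just to make the toy data repeatable.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the toy data are repeatable
df <- data.frame(
  Group = rep(c("ALL", "Clin", "Ref", "Ther"), each = 50),
  Name  = rep(rep(c("M", "F"), each = 25), 4),
  Freq  = sample(1:10000, 200, replace = TRUE)
)

# Hypothetical helper column: assign each row to a unit-wide bin by hand.
# Rounding roughly mimics bins of width 1 centered on integers.
df$LogBin <- round(log(df$Freq))

# One facet per bin; inside each facet, one stacked bar (M + F) per Group
p <- ggplot(df, aes(x = Group, fill = Name)) +
  geom_bar(alpha = 0.5, color = "black") +
  facet_wrap(~ LogBin, nrow = 1, strip.position = "bottom") +
  labs(x = "Events (log bin), Group within each bin",
       y = "Number of patients", fill = "Gender") +
  scale_fill_brewer(palette = "Spectral") +
  theme_bw() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5),
        strip.placement = "outside")
```

Each facet strip then plays the role of a histogram bin, and the bars inside it are the per-Group stacks. The trade-off is that the x axis is no longer continuous, so this is a bar chart dressed up as a histogram.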