
I currently have a data frame structured as follows:

  Establishment.date Species  Shade.Tol         Ele    Kipuka
1                1980  PSEMEN Intolerant Under 1050m On Kipuka
2                1981  PINCON Intolerant Above 1050m On Kipuka
3                1981  ABIPRO Intolerant Under 1050m On Kipuka
4                1981  ABIPRO Intolerant Under 1050m On Kipuka
5                1981  ABILAS   Tolerant Above 1050m On Kipuka
6                1982  ABILAS   Tolerant Above 1050m On Kipuka
7                1983  PSEMEN Intolerant Under 1050m On Kipuka
8                1984  TSUHET   Tolerant Under 1050m On Kipuka
9                1984  TSUHET   Tolerant Under 1050m On Kipuka
10               1984  PSEMEN Intolerant Under 1050m On Kipuka
11               1984  PINCON Intolerant Under 1050m On Kipuka
12               1984  ABIPRO Intolerant Above 1050m On Kipuka
13               1984  ABIPRO Intolerant Above 1050m On Kipuka

I am trying to make a bar plot showing the number of establishments that occurred at high and low elevations, faceted by shade tolerance, with the count of each class shown as a label. My current approach is to summarise the data frame into a new one, as below:

# A tibble: 9 x 4
# Groups:   Establishment.date, Shade.Tol [7]
  Establishment.date Shade.Tol  Ele         count
               <int> <fct>      <fct>       <int>
1               1980 Intolerant Under 1050m     1
2               1981 Intolerant Above 1050m     1
3               1981 Intolerant Under 1050m     2
4               1981 Tolerant   Above 1050m     1
5               1982 Tolerant   Above 1050m     1
6               1983 Intolerant Under 1050m     1
7               1984 Intolerant Above 1050m     2
8               1984 Intolerant Under 1050m     2
9               1984 Tolerant   Under 1050m     2

and plotting that new information into ggplot as follows:

cores_clean %>%
  group_by(Establishment.date,Shade.Tol,Ele) %>%
  summarise(count = n()) %>%
ggplot(aes(x = Ele, y=count, label=count)) +
  geom_bar(stat = "identity",position = "dodge") +
  geom_text(aes(label=count),size = 3)+
  facet_wrap(~ Shade.Tol)+
  #scale_fill_grey()+
  theme_bw() + 
  labs(x = "Elevation Range",
       y = "Count",
       title = "Establishments")+
  theme(plot.title = element_text(hjust = 0.5))

But when I run the code, the plot shows a stacked column of numbers on each bar, and these do not match the totals in the full data frame (n = 740). I tried adding geom_text(aes(label=sum(count))), but that printed numbers in the same positions, with the total number of observations repeated multiple times. I am not sure whether I am filtering the data incorrectly or passing it to ggplot incorrectly.

please include your data, e.g. posting the output of cores_clean %>% group_by(Establishment.date,Shade.Tol,Ele) %>% summarise(count = n()) %>% dput - Roman
@Roman I posted the output as an image where it says "new summarized data frame". I am not sure how to upload the csv file here if that's what you meant - k3r0
Images of datasets are not as helpful. Input the code from @Roman into the console and it should give you an output that starts with structure(.... Copy and paste that output directly into your question as code. It will allow for others to copy and paste that code in order to recreate your precise dataframe. - chemdork123

1 Answer


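Your main issue is that you group_by(Establishment.date), but that variable does not appear anywhere in your plot. The summary therefore has one row per year for each bar, and geom_text draws a label for every one of those rows. Here is one option that keeps the years as the fill colour and uses stat_summary to calculate the sums: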

cores_clean %>%
  group_by(Establishment.date,Shade.Tol,Ele) %>%
  dplyr::summarise(count = n()) %>%
ggplot(aes(x = Ele, y=count, fill = as.factor(Establishment.date))) +
  geom_bar(stat = "identity") +
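  # Sum the per-year counts within each bar and print the total above it.
  # after_stat(y) replaces the deprecated ..y.. notation (ggplot2 >= 3.3.0).
  stat_summary(geom = "text", aes(label = after_stat(y), group = Ele),
               fun = sum, vjust = -0.1) + 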
  facet_wrap(~ Shade.Tol) +
  theme_bw() + 
  labs(x = "Elevation Range", y = "Count",
       title = "Establishments", fill = "Year")+
  theme(plot.title = element_text(hjust = 0.5))
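
To see why the labels in the question stack up, it can help to count how many summary rows land on each bar. This is a minimal sketch, assuming only the 13 sample rows shown in the question; cores_sample, by_year and per_bar are names made up for the example, and the real cores_clean has 740 rows.

```r
library(dplyr)

# Assumption: the 13 sample rows from the question, reduced to the
# three columns the plot uses. cores_sample is a made-up name.
cores_sample <- data.frame(
  Establishment.date = c(1980, 1981, 1981, 1981, 1981, 1982, 1983,
                         1984, 1984, 1984, 1984, 1984, 1984),
  Shade.Tol = c("Intolerant", "Intolerant", "Intolerant", "Intolerant",
                "Tolerant", "Tolerant", "Intolerant", "Tolerant",
                "Tolerant", "Intolerant", "Intolerant", "Intolerant",
                "Intolerant"),
  Ele = c("Under 1050m", "Above 1050m", "Under 1050m", "Under 1050m",
          "Above 1050m", "Above 1050m", "Under 1050m", "Under 1050m",
          "Under 1050m", "Under 1050m", "Under 1050m", "Above 1050m",
          "Above 1050m")
)

# One row per year x tolerance x elevation, as in the question's summary.
by_year <- cores_sample %>%
  count(Establishment.date, Shade.Tol, Ele, name = "count")

# For each bar (tolerance x elevation): how many rows geom_text would
# label, and the total that the bar should actually show.
per_bar <- by_year %>%
  group_by(Shade.Tol, Ele) %>%
  summarise(labels_drawn = n(), total = sum(count), .groups = "drop")

print(per_bar)
```

On the sample rows, the Intolerant / Under 1050m bar gets four labels (one per year) whose counts sum to 6, which is the single number stat_summary prints instead.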


Alternatively, you could remove Establishment.date from your group_by and do this:

cores_clean %>%
  group_by(Shade.Tol,Ele) %>%
  dplyr::summarise(count = n()) %>%
ggplot(aes(x = Ele, y=count)) +
  geom_bar(stat = "identity") +
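  # Each bar is now a single row, so the sum is just that row's count.
  # after_stat(y) replaces the deprecated ..y.. notation (ggplot2 >= 3.3.0).
  stat_summary(geom = "text", aes(label = after_stat(y), group = Ele),
               fun = sum, vjust = -0.1) + 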
  facet_wrap(~ Shade.Tol) +
  theme_bw() + 
  labs(x = "Elevation Range", y = "Count",
       title = "Establishments")+
  theme(plot.title = element_text(hjust = 0.5))
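
Once the data has exactly one row per bar, the label can also be mapped straight from the count column with geom_text, so stat_summary becomes optional. The following is a sketch under stated assumptions rather than a drop-in replacement: cores_sample reproduces only the tolerance and elevation combinations of the 13 sample rows in the question, and both cores_sample and bar_counts are made-up names.

```r
library(dplyr)
library(ggplot2)

# Assumption: same Shade.Tol / Ele combinations as the question's 13
# sample rows (9 intolerant, 4 tolerant), with the other columns dropped.
cores_sample <- data.frame(
  Shade.Tol = rep(c("Intolerant", "Tolerant"), times = c(9, 4)),
  Ele = c(rep("Under 1050m", 6), rep("Above 1050m", 3),
          rep("Under 1050m", 2), rep("Above 1050m", 2))
)

# count() is shorthand for group_by() + summarise(n()) and returns
# an ungrouped result with one row per bar.
bar_counts <- cores_sample %>%
  count(Shade.Tol, Ele, name = "count")

# geom_col() is equivalent to geom_bar(stat = "identity").
p <- ggplot(bar_counts, aes(x = Ele, y = count)) +
  geom_col() +
  geom_text(aes(label = count), vjust = -0.3) +  # one label per bar
  facet_wrap(~ Shade.Tol) +
  theme_bw() +
  labs(x = "Elevation Range", y = "Count", title = "Establishments") +
  theme(plot.title = element_text(hjust = 0.5))

print(p)
```

With your real data you would pipe cores_clean into count() in place of cores_sample; the rest of the chain stays the same.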
