
Using ggplot2, I want to create a plot that combines the following layers on a single y-axis:

  • a geom_smooth line (the y-axis limits should be determined by this layer)

  • a geom_histogram (showing the distribution of the x variable)

Adding geom_histogram to the plot changes the y-axis limits. Instead, I would like the histogram to be rescaled so that it fits within the limits the plot would have with geom_smooth alone (here, roughly ylim = c(0, 110)).

library(dplyr)    # %>% and select()
library(tidyr)    # gather()
library(ggplot2)

set.seed(1)
age <- as.integer(runif(10000, 18, 80))
y <- rnorm(10000, 100, 10)
y2 <- rnorm(10000, 50, 5)

data <- data.frame(age, y, y2)

# Reshape to long format: one row per (age, type, value)
plot_data <- data %>% select(age, y, y2) %>% gather("type", "value", 2:3)

g <- ggplot(plot_data, mapping = aes(x = age, y = value, color = type)) +
  geom_smooth() +
  scale_x_continuous(labels = scales::comma) +
  geom_histogram(inherit.aes = FALSE, mapping = aes(x = age), alpha = 0.5)
# Goal: the histogram should show the count of the x variable (age here),
# rescaled so that max(count) = max(value)

g

1 Answer


You can use ..count.. to pull out the number of points in each bin and rescale it so that the tallest bar reaches the maximum of value. The histogram's y aesthetic then becomes count * max(value) / max(count).

ggplot(data = plot_data, aes(x = age)) +
  geom_smooth(aes(y = value, color = type)) +
  scale_x_continuous(labels = scales::comma) +
  geom_histogram(aes(y = ..count.. * (max(plot_data$value) / max(..count..))), alpha = 0.5)
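The same rescaling can be written with after_stat(), which replaces the ..count.. notation deprecated in ggplot2 3.4.0. The sketch below is a variant, not the code above: it scales the tallest bar to a chosen height rather than to max(value), since the raw values extend well above the smoothed lines. The name target_height and its value of 110 are assumptions taken from the ylim guess in the question, and bins = 30 is simply ggplot2's default made explicit.

```r
library(ggplot2)

# Same simulation as in the question, built directly in long format
set.seed(1)
n <- 10000
age <- as.integer(runif(n, 18, 80))
plot_data <- data.frame(
  age   = rep(age, times = 2),
  type  = rep(c("y", "y2"), each = n),
  value = c(rnorm(n, 100, 10), rnorm(n, 50, 5))
)

# Assumption: height (in y-axis units) that the tallest bar should reach
target_height <- 110

p <- ggplot(plot_data, aes(x = age)) +
  # Rescale the bin counts so the tallest bar equals target_height
  geom_histogram(
    aes(y = after_stat(count * target_height / max(count))),
    bins = 30, alpha = 0.5
  ) +
  geom_smooth(aes(y = value, colour = type)) +
  scale_x_continuous(labels = scales::comma)

# layer_data() returns the computed data of a layer; layer 1 is the histogram
hist_layer <- layer_data(p, 1)
tallest_bar <- max(hist_layer$y)
```

Printing p draws the plot, and tallest_bar lets you check that the rescaled histogram tops out at the chosen height. Note that the bar heights are no longer counts, so the y-axis labels apply only to the smoothed lines.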