Normally I would order bars (geom_col), for example, with the reorder function inside the aes mapping.

This time I am trying to plot two different datasets with different geoms. Each dataset has 10 factor levels, but only 7-8 of these overlap, so the final plot will have 12-13 different x-axis categories.

So far I can only produce a plot where the x-axis is sorted alphabetically. I would prefer an ordering like reorder(x, y, mean).

I have tried specifying a different dataset in each geom. I have tried the same with an additional generic dataset, containing all factor levels, in the main ggplot call. I have tried a single dataset that is subset in each geom. I have also tried specifying the breaks in the scale_x_discrete call.

But no luck.

My plot looks like this right now:

[Image: sorted alphabetically, both geoms present]

But I would like the x-axis sorted like this:

[Image: correct sort, but missing a geom]

Reproducible code below:

library(data.table)
library(ggplot2)

ach <- data.table(structure(list(navn = c("Site 2", "Site 2", "Site 2", "Site 2", 
"Site 2", "Site 2", "Site 2", "Site 2", "Site 2", "Site 2", "Site 1", 
"Site 1", "Site 1", "Site 1", "Site 1", "Site 1", "Site 1", "Site 1", 
"Site 1", "Site 1", "Site 1", "Site 1", "Site 1", "Site 1", "Site 1", 
"Site 1", "Site 1", "Site 1", "Site 1", "Site 1", "Site 2", "Site 2", 
"Site 2", "Site 2", "Site 2", "Site 2", "Site 2", "Site 2", "Site 2", 
"Site 2"), label = structure(c(3L, 1L, 5L, 2L, 4L, 12L, 11L, 
8L, 13L, 6L, 1L, 2L, 3L, 4L, 5L, 8L, 6L, 13L, 7L, 12L, 1L, 2L, 
3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 3L, 1L, 2L, 5L, 4L, 10L, 11L, 
12L, 6L, 8L), .Label = c("Chrome", "Safari", "Mobile Safari", 
"Microsoft Edge", "Chrome Mobile", "Firefox", "Chrome Webview", 
"Internet Explorer", "Facebook", "Google Search App", "Chrome Mobile iOS", 
"Samsung Browser", "Unknown"), class = c("ordered", "factor")), 
    pct = c(0.37077667377162, 0.284011127192169, 0.114574093209446, 
    0.107776254140023, 0.0273750053538491, 0.0197337026421917, 
    0.0170135341174811, 0.0168099157004604, 0.0163568871020231, 
    0.0132327870134832, 0.477916324161792, 0.16184883132943, 
    0.149806240099312, 0.0666580391386212, 0.0362168449289707, 
    0.0293411843838547, 0.023760052468152, 0.0142741652666905, 
    0.0119379425103071, 0.00782828343346858, 0.521873820606366, 
    0.159862875833438, 0.127547490250346, 0.0637973329978614, 
    0.033313938860234, 0.0236350484337653, 0.0124308089067807, 
    0.0105044659705623, 0.0187130456661215, 0.00656529123160146, 
    0.338447084607149, 0.325075588616413, 0.126214081751673, 
    0.0918676900891937, 0.0368556480116706, 0.016716050669163, 
    0.0157698274108945, 0.0154219264971084, 0.0127328373084234, 
    0.00817483113360235), graf = c("All time", "All time", "All time", 
    "All time", "All time", "All time", "All time", "All time", 
    "All time", "All time", "All time", "All time", "All time", 
    "All time", "All time", "All time", "All time", "All time", 
    "All time", "All time", "Last month", "Last month", "Last month", 
    "Last month", "Last month", "Last month", "Last month", "Last month", 
    "Last month", "Last month", "Last month", "Last month", "Last month", 
    "Last month", "Last month", "Last month", "Last month", "Last month", 
    "Last month", "Last month")), row.names = c(NA, -40L), class = "data.frame", index = structure(integer(0), "`__graf`" = integer(0))))

ggplot()+
  geom_col(data=ach[graf=='All time'], aes(x=label, y=pct, fill=navn),
           position="dodge2")+
  geom_point(data=ach[graf=='Last month'], aes(x=label, y=pct, fill=navn, color="1"),
             stroke=2,shape=21, size=4,  position=position_dodge2(.9), show.legend = TRUE)+
  scale_fill_discrete(name="All time")+
  scale_color_manual(values='black',
                     name="Last month", labels="")+
  scale_y_continuous(labels=scales::percent)+
  labs(title="Top 10 browsers", x="", y="Pct pageviews")+
  theme_bw()+
  theme(legend.position = "bottom", axis.text.x = element_text(angle=90, hjust=1, vjust=0.5))

EDIT: Stefan solved the problem with tidyr. If you are using data.table like me, another approach is to use dcast() to make the data table wide.
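
A minimal sketch of the dcast() route, assuming long data with the same column names as ach. The toy table and its values are made up for illustration and are not taken from the real data:

```r
library(data.table)

# Hypothetical long-format data with the same columns as `ach`
toy <- data.table(
  navn  = c("Site 1", "Site 1", "Site 1", "Site 2", "Site 2", "Site 2"),
  label = c("Chrome", "Safari", "Chrome", "Chrome", "Firefox", "Firefox"),
  graf  = c("All time", "All time", "Last month",
            "All time", "All time", "Last month"),
  pct   = c(0.5, 0.2, 0.6, 0.4, 0.1, 0.15)
)

# One row per navn/label pair and one column per value of graf;
# combinations that are absent in the long data are filled with NA
toy_wide <- dcast(toy, navn + label ~ graf, value.var = "pct")
print(toy_wide)
```

The resulting columns are named after the values of graf, so they need backticks when referenced in aes, e.g. aes(y = `All time`).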

In my case, however, I started out with two data.tables (all_time and last_month). Instead of using rbind to produce ach, it would be more efficient to merge them directly with merge(all_time, last_month, all=TRUE, suffixes=c("all_time", "last_month")). This creates the wide data.table, with NA wherever a factor level is missing from one of the two tables.
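
A rough sketch of that merge, with two made-up tables standing in for all_time and last_month. The explicit by = c("navn", "label") and the leading underscore in the suffixes are my assumptions here; without an explicit by, the shared pct column could end up being used as a join column as well:

```r
library(data.table)

# Hypothetical stand-ins for the two original tables
all_time <- data.table(
  navn  = c("Site 1", "Site 1", "Site 2"),
  label = c("Chrome", "Safari", "Chrome"),
  pct   = c(0.5, 0.2, 0.4)
)
last_month <- data.table(
  navn  = c("Site 1", "Site 2", "Site 2"),
  label = c("Chrome", "Chrome", "Firefox"),
  pct   = c(0.6, 0.45, 0.15)
)

# Full outer join: keep every navn/label pair found in either table;
# the two pct columns get distinct names via the suffixes
wide <- merge(all_time, last_month,
              by = c("navn", "label"), all = TRUE,
              suffixes = c("_all_time", "_last_month"))
print(wide)
```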

1 Answer

As far as I understand it, you could achieve your desired result by converting your data to wide format, e.g. with tidyr::pivot_wider:


library(tidyr)
library(ggplot2)

ach_wide <- ach %>% 
  pivot_wider(names_from = "graf", values_from = "pct")

ggplot(data = ach_wide, aes(label, fill = navn)) +
  geom_col(aes(y = `All time`),
    position = "dodge2"
  ) +
  geom_point(aes(y = `Last month`, color = "1"),
    stroke = 2, shape = 21, size = 4, position = position_dodge2(.9)
  ) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Top 10 browsers", x = "", y = "Pct pageviews", fill = "All time", color = "Last month") +
  theme_bw() +
  theme(legend.position = "bottom", axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
#> Warning: Removed 3 rows containing missing values (geom_col).
#> Warning: Removed 3 rows containing missing values (geom_point).
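
Once the data is wide, an ordering in the spirit of reorder(x, y, mean) can be applied to the shared x variable. The following is only a sketch on made-up data; the column names all_time and last_month and all values are hypothetical, and sorting by descending all-time share is just one possible choice:

```r
library(ggplot2)

# Hypothetical wide data; NA marks a browser missing from one period
toy_wide <- data.frame(
  navn       = c("Site 1", "Site 1", "Site 1", "Site 2", "Site 2", "Site 2"),
  label      = c("Chrome", "Safari", "Facebook", "Chrome", "Safari", "Facebook"),
  all_time   = c(0.5, 0.2, NA, 0.4, 0.3, NA),
  last_month = c(0.6, 0.1, 0.05, 0.45, 0.25, 0.02)
)

# Sort levels by mean all-time share, largest first (hence the minus sign);
# na.rm = TRUE is passed on to mean(), and levels without any all-time
# value are placed last
toy_wide$label <- reorder(toy_wide$label, -toy_wide$all_time,
                          FUN = mean, na.rm = TRUE)

# Both geoms share the same x variable, so they share the same ordering
p <- ggplot(toy_wide, aes(label, fill = navn)) +
  geom_col(aes(y = all_time), position = "dodge2", na.rm = TRUE) +
  geom_point(aes(y = last_month), shape = 21, size = 4,
             position = position_dodge2(0.9), na.rm = TRUE)
print(p)
```

Because the factor levels are set on one dataset before plotting, no layer can silently fall back to a different ordering.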