I have searched for an answer to this but could not find one. I'd like to show only the first X bars of a ggplot bar plot, using only code inside the ggplot call itself. Let me explain a bit further. I have a dataset listing the number of residents in a city for each year (1999-2018), gender, citizenship and quarter/area of the city. I select one year and one citizenship, then plot the number of residents (y axis, stacked bars for gender male/female) for each quarter, sorted by number of residents.

This is an example: Plotting residents for each quarter, "Afghanistan", year 2018

I would simply like to cut the plot down to the first X (e.g. 10) bars. I have to insert the code into a Shiny app, so I'm trying to put the instructions directly into the code inside the "manipulate" call below. Is there a way to do it in the aes/reorder functions (see code below)?

This is the code, where I tried (without success) to use the "subset" function to filter on the number of residents (per quarter, summing over both genders, i.e. I want to retain a quarter where the Females are 8 and the Males are 7).

Many thanks!

p10 <- manipulate(
  ggplot(subset(df_tothab_STR_citt[df_tothab_STR_citt$Y==YearList &
                              df_tothab_STR_citt$Citt==CittList,]), 
         aes(x = reorder(Nil, Residenti, FUN = sum),
             y = Residenti,
             group = Gen,
             fill = Gen)) + 
    geom_bar(stat = "identity") +
    scale_fill_brewer(palette = "Set1") +
    ggtitle(paste(CittList, "-", YearList)) +
    coord_flip() +
    labs(fill="Gender", x="Nil", y="Resident") +
    theme_bw(),
  CittList = picker(
    as.list(unique(as.character(df_tothab_STR_citt$Citt)))),
  YearList = picker(
    as.list(unique(as.character(sort(df_tothab_STR_citt$Y)))))
)

This is also the "head" and "str" of the dataset.

head(df_tothab_STR_citt)
     Y                    Nil     Gen        Citt Residenti
1 2018                 Baggio Femmine Afghanistan         2
2 1999             Bande Nere Femmine Afghanistan         1
3 2000             Bande Nere Femmine Afghanistan         1
4 2001             Bande Nere Femmine Afghanistan         1
5 2002             Bande Nere Femmine Afghanistan         1
6 2014 Buenos Aires - Venezia Femmine Afghanistan         1
str(df_tothab_STR_citt)
'data.frame':   196703 obs. of  5 variables:
 $ Y        : int  2018 1999 2000 2001 2002 2014 2016 2017 2018 2012 ...
 $ Nil      : chr  "Baggio" "Bande Nere" "Bande Nere" "Bande Nere" ...
 $ Gen      : Factor w/ 2 levels "Femmine","Maschi": 1 1 1 1 1 1 1 1 1 1 ...
 $ Citt     : chr  "Afghanistan" "Afghanistan" "Afghanistan" "Afghanistan" ...
 $ Residenti: int  2 1 1 1 1 1 1 1 1 3 ...

Edit: For the sake of completeness, I post here what I've done in the Shiny app to get this working for one of the charts I'm plotting in the app.

Basically, the "reactive" function calculates df13_react based on input$something, which comes from sliders or select inputs. Then renderPlot/ggplot filters the result of the calculations based only on a ratio range (again, from min/max sliders between 0 and 100%).

What I was doing with the manipulate function was reproducing the same behavior outside of the Shiny app, only to test it without having to reload the app every time.

Best regards.

  # PLOT13:
  output$t13 <- renderText({
    "Citizens (ITA/STR) by Nil, gender (M/F) and year (selectable)"
  })

  df13_react <- reactive({
    df13 <- df_tothab[df_tothab$Y==input$YearList,]
    # df13Nil <- unique(df_tothab$Nil[df_tothab$Residenti>input$minRes13])

    df13agg <- aggregate(df13$Residenti,
                         by=list(Nil=df13$Nil, Cittadinanza=df13$Citt),
                         FUN=sum)
    colnames(df13agg)[3]<-"Residenti"
    df13agg <- dcast(df13agg, Nil ~ Cittadinanza, value.var="Residenti")
    df13agg[is.na(df13agg)] <- 0
    df13agg$Ratio <- df13agg$STR / (df13agg$STR+df13agg$ITA)
    df13agg[is.na(df13agg)] <- 0
    df13$Ratio <- NA

    addRatio <- function (df13, df13agg){
      vRatio <- rep(0, length(df13$Ratio))
      for(i in 1:length(df13$Ratio)){
        vRatio[i] <- df13agg$Ratio[df13agg$Nil==df13$Nil[i]]
      }
      vRatio
    }
    df13$Ratio <- addRatio(df13, df13agg)
    return(df13)
  })

  output$p13 <- renderPlot({
    df13 <- df13_react()
    ggplot(df13[df13$Ratio >= input$minRatio13 / 100 &
                  df13$Ratio <= input$maxRatio13 / 100,],
           aes(x = reorder(Nil, Ratio),
               y = Residenti,
               group = interaction(Gen, Cittadinanza),
               fill = interaction(Gen, Cittadinanza))) + 
      geom_bar(stat = "identity", position = "fill") +
      scale_fill_manual(values=c("red","green",
                                 "blue","turquoise4")) +
      ggtitle(paste(input$YearList)) +
      labs(fill="Legend", x=NULL, y="Resident") +
      scale_y_continuous(labels = scales::percent_format(accuracy = 5L),
                         breaks = c(seq(0, 1, 0.1))) +
      coord_flip() +
      theme(panel.ontop = TRUE,
            panel.background = element_blank(),
            panel.grid = element_line(colour = "black"))
  })

1 Answer

Looks like the code you have already generates the plot you want, so you are just looking for a way to cut down the number of observations to be the "top X", correct?

The simple answer to this might be to first (1) sort your dataframe (by "Residenti" from the look of your data), and then (2) plot the subset of that dataframe via head(df, x).

So something like:

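# order() needs the column qualified with the data frame it belongs to
sorted.df <- your.df[order(your.df$Residenti, decreasing = TRUE), ]
# head() already returns the first top.X rows, so no subset() wrapper is needed
ggplot(data = head(sorted.df, top.X), aes(...)) + ...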
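Note that head() on the sorted rows keeps the top X rows, not the top X quarters: each quarter has one row per gender, so a quarter with 8 females and 7 males could be split or dropped. If the goal is the top X quarters by total residents, one possible approach is to sum per quarter first and then keep all rows of the winning quarters. The following is only a minimal sketch in base R: keep_top_groups() is a hypothetical helper (not part of ggplot2), and the toy data are made up, reusing the column names from the question.

```r
# Hypothetical helper: keep only the rows that belong to the x groups
# with the largest summed value (e.g. the x quarters with most residents).
keep_top_groups <- function(df, x, group_col = "Nil", value_col = "Residenti") {
  # Sum the value column per group (Femmine + Maschi for each quarter)
  totals <- tapply(df[[value_col]], df[[group_col]], sum)
  # sort() drops NA totals (e.g. from unused factor levels) by default
  sorted <- sort(totals, decreasing = TRUE)
  # Names of the (at most) x groups with the largest totals
  top <- names(head(sorted, x))
  df[df[[group_col]] %in% top, , drop = FALSE]
}

# Toy data with the question's column names (values are invented)
toy <- data.frame(
  Nil = c("A", "A", "B", "B", "C", "C", "D"),
  Gen = c("Femmine", "Maschi", "Femmine", "Maschi", "Femmine", "Maschi", "Maschi"),
  Residenti = c(8, 7, 20, 1, 2, 2, 16),
  stringsAsFactors = FALSE
)

# Totals are A = 15, B = 21, C = 4, D = 16, so B and D are kept
top2 <- keep_top_groups(toy, 2)

# Possible use inside the plot call (filtered_df stands for the data already
# filtered by year and citizenship; it is a placeholder name):
# library(ggplot2)
# ggplot(keep_top_groups(filtered_df, 10),
#        aes(x = reorder(Nil, Residenti, FUN = sum), y = Residenti, fill = Gen)) +
#   geom_bar(stat = "identity") +
#   coord_flip()
```

Because the helper takes and returns a plain data frame, it should slot into the data argument of ggplot() in either the manipulate call or a renderPlot block, with X coming from a slider or picker.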