I am learning Shiny and would like help with an app I am building. The app takes dynamic inputs from the user and should generate bar and line charts. I managed to create the bar chart, but it shows incorrect results.

What I want is this: the variable selected as the row variable should be on the x-axis, and the y-axis should show percentages on a scale up to 100%. The column variable should be the comparison variable, which is why I am using position = "dodge". My data is big, so I have created sample data to depict the situation. Since the actual data is a data.table, I store the sample data as a data.table as well. I am not sure how to include data that is not in a file, so I create it first so that it exists in the R environment and then run the app -

    Location <- sample(1:5,100,replace = T)
    Brand <- sample(1:3,100,replace = T)
    Year <- rep(c("Year 2014","Year 2015"),50)
    Q1 <- sample(1:5,100,replace = T)
    Q2 <- sample(1:5,100,replace = T)

    mydata <- as.data.table(cbind(Location,Brand,Year,Q1,Q2))

Below is the Shiny code that I am using -

    library("shiny")
    library("ggplot2")
    library("scales")
    library("data.table")
    library("plotly")

    ui <- shinyUI(fluidPage(
      sidebarPanel(
        fluidRow(
          column(10,
                 div(style = "font-size: 13px;", selectInput("rowvar", label = "Select Row Variable", ''))
          ),
          tags$br(),
          tags$br(),
          column(10,
                 div(style = "font-size: 13px;", selectInput("columnvar", "Select Column Variable", ''))
          ))

      ),
      tabPanel("First Page"),
      mainPanel(tabsetPanel(id='charts',
                            tabPanel("charts",tags$b(tags$br("Graphical Output" )),tags$br(),plotlyOutput("plot1"))
      )
      )
    ))

    server <- shinyServer(function(input, output,session){
      updateTabsetPanel(session = session
                        ,inputId = 'myTabs')


      observe({
        updateSelectInput(session, "rowvar", choices = (as.character(colnames(mydata))),selected = "mpg")
      })

      observe({
        updateSelectInput(session, "columnvar", choices = (as.character(colnames(mydata))),selected = "cyl")
      })

      output$plot1 <- renderPlotly({
        validate(need(input$rowvar,''),
                 need(input$columnvar,''))
        ggplot(mydata, aes(x= get(input$rowvar))) + 
          geom_bar(aes(y = ..prop.., fill = get(input$columnvar)), position = "dodge", stat="count") +
          geom_text(aes( label = scales::percent(..prop..),
                         y= ..prop.. ), stat= "count", vjust = -.5) +
          labs(y = "Percent", fill=input$rowvar) +
          scale_y_continuous(labels=percent,limits = c(0,1))

      })

    })

    shinyApp(ui = ui, server = server)

The problems are -

  1. All bars are 100%. The proportions are not being calculated properly, and I am not sure where I am going wrong.

  2. If I try to use the group parameter, I get an error saying the "input" variable is not found. I tried group = get(input$columnvar).

  3. I believe I need to restructure my data for the line chart. How can I dynamically restructure the data.table and then reuse it for the line chart? How can I generate the same bar chart as a line chart?

  4. I am using renderPlotly so that I can use plotly features such as showing the percentages on hover, zooming, etc. However, the hover text shows input$variable. How can I get rid of it and show proper names?

I have tried to describe the situation in detail. Please suggest a solution.

Thank you!!

I don't have a complete answer, but for 1., maybe in your geom_bar use y = (..count..)/sum(..count..) instead of y = ..prop.., and for 4., in ggplot(mydata, aes(x= get(input$rowvar))) use ggplot(mydata, aes_string(x= input$rowvar)) instead; also replace the geom_text() call with geom_text(aes( label = (..count..)/sum(..count..)*100, ....... - MLavoie
@MLavoie, thank you for the suggestion. However, when I replace ..prop.. with (..count..)/sum(..count..), I start getting the error: object 'count' not found. Not sure if you get the same error at your end. - user1412
No, I did not get this error. This is what I used: geom_bar(aes(y = (..count..)/sum(..count..), fill = get(input$columnvar)), position = "dodge", stat="count") - MLavoie
@MLavoie, it worked. I had been trying multiple things to sort this out, and it seems I had changed the code to stat = "identity", which was causing that error. I changed it back to stat = "count" and no longer get the error. However, the plot does not show the correct percentages: the bars for Year 2014 should add up to 100%, and so should the bars for Year 2015. I am not sure how the bar heights are currently calculated. I am checking them against the table generated by prop.table(table(mydata$Brand,mydata$Year),2). Would you know how to correct this? - user1412
100% is the sum of all bars. - MLavoie

1 Answer

To group variables properly for plotting, geom_bar requires either that the x values be numeric and the fill values be factors, or that the group argument be used to specify the grouping variables explicitly. However, plotly throws an error when group is used. The approach below converts the x variable to integer and the fill variable to factor so that they are grouped properly. This retains the use of geom_bar to calculate the percentages.
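
As a rough illustration of why the grouping matters, here is a base-R sketch that mimics the arithmetic (it is not ggplot2's internal code, and the vectors `x` and `fill` are made-up stand-ins for the row and column variables). `..prop..` is each count divided by the total count of its group. If every x/fill combination is its own group, every proportion is 1, which matches the 100% bars in the question. If the groups are defined by the fill variable alone, the proportions within each fill level sum to 1.

```r
# Hypothetical stand-ins for the row (x) and column (fill) variables
x    <- c("a", "a", "b", "b", "b", "c", "c")
fill <- c("u", "v", "u", "u", "v", "u", "v")

counts <- table(x, fill)

# Every x/fill cell is its own group: count / group total is always 1
prop_cell_groups <- counts / counts

# Groups defined by fill only: each column is divided by its own total
prop_fill_groups <- prop.table(counts, margin = 2)

print(prop_cell_groups)
print(prop_fill_groups)
```

The second table is the same kind of result as the prop.table(table(...), 2) check mentioned in the comments, where each column sums to 100%.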

First, however, I wonder if mydata is specified correctly. Given that the data is a mix of character and integer, cbind(Location, Brand, Year, Q1, Q2) gives a character matrix which is then converted to a data.table where all variables are character mode. In the code below, I've defined mydata directly as a data.table but have converted Q1 to character mode so that mydata contains a mix of character and numeric.
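
A small sketch of that coercion, using shortened made-up vectors in place of the sample data: cbind() builds a matrix, and a matrix can hold only one type, so mixing integers with strings turns everything into character.

```r
# Shortened, made-up versions of two of the sample columns
Location <- c(1L, 3L, 5L)
Year     <- c("Year 2014", "Year 2015", "Year 2014")

# cbind() returns a matrix; the integers are coerced to strings
m <- cbind(Location, Year)
typeof(m)   # "character"

# Building the table column by column keeps each column's own type
# (data.table() behaves the same way in this respect)
df <- data.frame(Location, Year, stringsAsFactors = FALSE)
sapply(df, class)
```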

The approach used below is to create a new data frame, plotdata, containing the x and fill data. The x data is converted to numeric, if necessary, by first making it a factor variable and then using unclass to get the factor's integer codes. The fill data is converted to a factor. plotdata is then used to generate the ggplot plot, which is displayed using plotly. The code includes a couple of other modifications to improve the appearance of the chart.
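
The factor-to-integer step can be sketched on its own like this (the `year` vector and the `x_*` names are made up for illustration): the integer codes become the numeric x positions, and the factor levels are kept as the labels to print at those positions.

```r
# Hypothetical character column standing in for a non-numeric row variable
year <- c("Year 2015", "Year 2014", "Year 2015", "Year 2014")

year_factor <- factor(year)                     # levels are sorted alphabetically
x_codes     <- unclass(year_factor)             # integer codes, one per observation
x_breaks    <- seq_len(nlevels(year_factor))    # axis positions, one per level
x_labels    <- levels(year_factor)              # text to show at those positions

print(as.integer(x_codes))
print(x_labels)
```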

EDIT

The code below has been updated to show the name of the row variable beneath its bar. Also, the percentage and count for each bar are shown only when the mouse pointer hovers over the bar.

  library("shiny")
  library("ggplot2")
  library("scales")
  library(plotly)
  library(data.table)

  Location <- sample(1:5,100,replace = T)
  Brand <- sample(1:3,100,replace = T)
  Year <- rep(c("Year 2014","Year 2015"),50)
  Q1 <- sample(1:5,100,replace = T)
  Q2 <- sample(1:5,100,replace = T)
  Q3 <- sample(seq(1,3,.5), 100, replace=T)
  mydata <- data.table(Location,Brand,Year,Q1,Q2, Q3)
#
# convert Q1 to character for demonstration purposes
#
  mydata$Q1 <- as.character(mydata$Q1)

  ui <- shinyUI(fluidPage(
    sidebarPanel(
      fluidRow(
        column(10,
               div(style = "font-size: 13px;", selectInput("rowvar", label = "Select Row Variable", 
                                                           choices=colnames(mydata)))),
        tags$br(),
        tags$br(),
        column(10,
               div(style = "font-size: 13px;", selectInput("columnvar", label="Select Column Variable", 
                                                           choices=colnames(mydata))))
        )
    ),
    tabPanel("First Page"),
    mainPanel(tabsetPanel(id='charts',
                          tabPanel("charts",tags$b(tags$br("Graphical Output" )),tags$br(),plotlyOutput("plot1"))
    )
    )
  ))
  server <- shinyServer(function(input, output,session){
    updateTabsetPanel(session = session
                      ,inputId = 'myTabs')
    observe({
      updateSelectInput(session, "rowvar", choices = colnames(mydata), selected=colnames(mydata)[1])
    })
    observe({
      updateSelectInput(session, "columnvar", choices = colnames(mydata), selected=colnames(mydata)[2])
    })
    output$plot1 <- renderPlotly({
      # wait until both inputs are available before touching the data
      validate(need(input$rowvar,''),
               need(input$columnvar,''))
#
#   create data frame for plotting containing x variables as integer and fill variables as factors
#   
      # look the columns up in mydata; get() would pick up the global vectors instead
      rowdata <- mydata[[input$rowvar]]
      coldata <- mydata[[input$columnvar]]
      if(is.numeric(rowdata))  {
        rowvar_brks <- sort(unique(rowdata))
        rowvar_lbls <- as.character(rowvar_brks)
        plotdata <- data.frame(rowdata, factor(coldata) )
      }
      else {
        rowvar_factors <- factor(rowdata)
        rowvar_brks <- 1:nlevels(rowvar_factors)
        rowvar_lbls <- levels(rowvar_factors)
        plotdata <- data.frame(unclass(rowvar_factors), factor(coldata) )
      }
      colnames(plotdata) <- c(input$rowvar, input$columnvar)
      col_width <- .85*mean(diff(rowvar_brks))
      sp <- ggplot(plotdata, aes_(x = as.name(input$rowvar), fill = as.name(input$columnvar))) +
        geom_bar( aes(y= ..prop..), stat="count", position=position_dodge(width=col_width)) +
        geom_text(aes( label = paste(scales::percent(..prop..),"<br>", "count:",..count..,"<br>"),  y= ..prop.. + .01),
                  stat= "count", position=position_dodge(width=col_width), size=3, alpha=0) +
        labs(x= input$rowvar, y = "Percent", fill=input$columnvar) +
        scale_y_continuous(labels=percent) +
        scale_x_continuous(breaks=rowvar_brks, labels=rowvar_lbls)
        ggplotly(sp, tooltip="none")
      })
  })

  shinyApp(ui = ui, server = server)