
Hi, I am trying to create a stacked bar graph using ggplot2. I have a sample size of n=102; while I know each bar will be extremely small, I need to show it as an example.

I asked people which social networks they use online and got back 10 different social networks, varying in use. I created a column for each network (apologies, I'm new to programming and had a great deal of trouble creating the table below):

Timestamp (ID) | Facebook | Twitter | Instagram | Snapchat | GooglePlus
---------------|----------|---------|-----------|----------|-----------
12331          | 1        | 2       | 3         | 4        | 5
12312          | 1        | 2       | 3         | NA       | 5
12323          | 1        | NA      | 3         | 4        | 5
123234         | 1        | 2       | NA        | NA       | 5
12324          | 1        | NA      | NA        | NA       | NA

The Timestamp in the table is the ID for each person, and there is one column per social network. I used the melt function from the "reshape" package to get a single variable that lists all the social networks for each timestamp.

dfm <- melt(survey_results2[, c('Timestamp', 'Facebook', 'Twitter', 'Instagram',
                                'Snapchat', 'GooglePlus', 'LinkedIn', 'Pinterest',
                                'Tumblr', 'Quora', 'Drivetribe')],
            id.vars = 'Timestamp', na.rm = TRUE)

I then used ggplot to get a stacked bar graph, except I get no bars displayed when running the following:

ggplot(dfm, aes(x=Timestamp, y=value, fill=variable)) +
  geom_bar(position="stack", stat = "identity") +
  labs(title="Social networks per person surveyed",
       x="ID of person surveyed",
       y="Social Networks Used") +
  theme(plot.title=element_text(size = rel(2.5)))

This is what I get: [image: What I Get]

I want something like this, which is from an answer by Roman Luštrik to a similar question, "making a stacked bar plot from multiple variables":

[image: What I want]

The dput is the following:

130201604529, 11302016114941, 12012016193618, 12012016195036, 
12012016203242, 12012016203826, 12012016204112, 12012016223032, 
11292016132850, 12012016193618), variable = structure(c(1L, 1L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 10L), .Label = c("Facebook", 
"Twitter", "Instagram", "Snapchat", "GooglePlus", "LinkedIn", 
"Pinterest", "Tumblr", "Quora", "Drivetribe"), class = "factor"), 
    value = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
Can you include the output of dput(dfm) in your question, please? - J.Con
@J.Con I have updated with the dput. I am unsure what dput does; can you explain what I am looking at? - Z. Young
dput() gives us something that can be pasted into R immediately, so we have the exact contents and structure of your data frame. Unfortunately, you have left some of yours out. If you could post the exact output, your code will then be reproducible (see the hanging comma at the end with no closing bracket in your question?). - J.Con
Does making Timestamp a factor when plotting do what you want? - aosmith

1 Answer


Simply convert the x variable to a factor, as @aosmith suggests in the comments: in the ggplot() call, use aes(x=factor(Timestamp), ...). Right now the graph treats Timestamp as a number on a continuous axis running from the lowest to the highest value, when you want it treated as a category, so the bars end up too thin to see. For a percent-level stacked graph, also change the position argument in geom_bar() to "fill" and use the scales package for y-axis percentages:

library(ggplot2)
library(reshape2)
library(scales)

data = "Timestamp, Facebook, Twitter, Instagram, Snapchat, GooglePlus, LinkedIn, Pinterest, Tumblr, Quora, Drivetribe
12331, 1, 1, 1, 1, 1, 1, 1, 1, , 
12312, 1, 1, 1, , 1, , 1, 1, , 
12323, 1, , 1, 1, 1, 1, , , 1, 1
123234, 1, 1, , , 1, 1, , 1, 1, 1
12324, 1, , 1, , 1, , 1, , ,"

df = read.csv(text=data, header=TRUE)

dfm <- melt(df[c('Timestamp','Facebook', 'Twitter', 'Instagram',
                  'Snapchat', 'GooglePlus', 'LinkedIn', 'Pinterest', 
                  'Tumblr', 'Quora', 'Drivetribe')], 
            id.vars='Timestamp', na.rm=TRUE)
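
If it helps to see what melt() produces before plotting, here is a minimal sketch on a made-up three-person, three-network subset. The names wide, long and counts are hypothetical and only for illustration; the per-person sums should correspond to the bar heights in the unit-level graph.

```r
library(reshape2)

# Hypothetical wide data: one row per person, one column per network,
# NA where the person does not use that network
wide <- data.frame(
  Timestamp = c(12331, 12312, 12324),
  Facebook  = c(1, 1, 1),
  Twitter   = c(1, 1, NA),
  Instagram = c(1, NA, NA)
)

# Long format: one row per (person, network) pair; NA rows are dropped
long <- melt(wide, id.vars = "Timestamp", na.rm = TRUE)
print(long)

# Number of networks per person, i.e. the total height of each stacked bar
counts <- tapply(long$value, long$Timestamp, sum)
print(counts)
```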

Unit Level Stacked Bar Graph

ggplot(dfm, aes(x=factor(Timestamp), y=value, fill=variable)) +
  geom_bar(position="stack", stat = "identity") +
  labs(title="Social networks per person surveyed",
       x="ID of person surveyed",
       y="Social Networks Used") +
  theme(plot.title=element_text(size = rel(2.5)))

Unit Level Stack Bar Graph

Percent Level Stacked Bar Graph

ggplot(dfm, aes(x=factor(Timestamp), y=value, fill=variable)) +
  geom_bar(position="fill", stat = "identity") +
  labs(title="Social networks per person surveyed",
       x="ID of person surveyed",
       y="Social Networks Used") +
  theme(plot.title=element_text(size = rel(2.5))) +
  scale_y_continuous(labels = percent)

Percent Level Stack Bar Graph
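
As a rough illustration of why the numeric axis hid the bars: geom_bar() sets its default width to 90% of the resolution of the data, which for a continuous x is essentially the smallest gap between distinct values. The sketch below mimics that idea in base R using the five sample IDs above. It is an approximation of the reasoning rather than ggplot2's internal code, and the variable names are made up for illustration.

```r
# IDs from the sample data above, treated as plain numbers
ids <- c(12331, 12312, 12323, 123234, 12324)

# Smallest gap between distinct IDs: roughly the "resolution"
# of a continuous x variable
smallest_gap <- min(diff(sort(unique(ids))))

# Default bar width is 90% of that resolution
bar_width <- 0.9 * smallest_gap

# Span of the x axis when the IDs are treated as numbers
axis_span <- diff(range(ids))

# Fraction of the axis that a single bar occupies (tiny)
print(bar_width / axis_span)

# As a factor, each ID is simply one of five evenly spaced categories
print(nlevels(factor(ids)))
```

With real timestamps in the trillions the span is far larger, so the bars shrink even further, which is consistent with the empty-looking plot in the question.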