13
votes

I have a matrix with the following entries:

dput(MilDis[1:200,])
structure(list(hhDomMil = c("HED", "ETB", "HED", "ETB", "PER", 
"BUM", "EXP", "TRA", "TRA", "PMA", "MAT", "MAT", "KON", "ETB", 
"PMA", "PMA", "HED", "BUM", "BUM", "HED", "PMA", "PMA", "HED", 
"TRA", "BUM", "EXP", "BUM", "PMA", "ETB", "MAT", "ETB", "ETB", 
"KON", "MAT", "TRA", "BUM", "BUM", "TRA", "TRA", "PMA", "PMA", 
"PMA", "MAT", "ETB", "TRA", "BUM", "TRA", "MAT", "BUM", "ETB", 
"TRA", "TRA", "BUM", "KON", "ETB", "ETB", "ETB", "BUM", "KON", 
"ETB", "ETB", "PMA", "TRA", "PER", "PER", "MAT", "HED", "KON", 
"TRA", "TRA", "TRA", "EXP", "TRA", "BUM", "MAT", "MAT", "TRA", 
"PMA", "HED", "PER", "TRA", "PER", "EXP", "PER", "BUM", "KON", 
"BUM", "ETB", "ETB", "TRA", "PER", "ETB", "KON", "KON", "BUM", 
"ETB", "BUM", "MAT", "BUM", "KON", "KON", "ETB", "MAT", "KON", 
"PER", "ETB", "ETB", "KON", "PMA", "PER", "HED", "HED", "PMA", 
"MAT", "PMA", "PER", "PMA", "TRA", "TRA", "MAT", "BUM", "BUM", 
"KON", "ETB", "ETB", "ETB", "PMA", "TRA", "TRA", "PMA", "PER", 
"KON", "PER", "BUM", "KON", "ETB", "ETB", "BUM", "TRA", "ETB", 
"PMA", "HED", "MAT", "TRA", "BUM", "PMA", "BUM", "ETB", "TRA", 
"TRA", "TRA", "PER", "EXP", "HED", "BUM", "EXP", "HED", "BUM", 
"MAT", "DDR", "BUM", "MAT", "KON", "HED", "HED", "TRA", "BUM", 
"PMA", "PMA", "PMA", "KON", "KON", "MAT", "ETB", "MAT", "TRA", 
"MAT", "ETB", "ETB", "TRA", "MAT", "ETB", "TRA", "HED", "BUM", 
"MAT", "TRA", "PMA", "BUM", "BUM", "EXP", "ETB", "EXP", "EXP", 
"MAT", "TRA", "KON", "BUM", "BUM", "HED"), kclust = c(1L, 2L, 
15L, 4L, 5L, 6L, 5L, 7L, 8L, 5L, 6L, 5L, 11L, 6L, 5L, 1L, 9L, 
10L, 2L, 1L, 9L, 8L, 4L, 11L, 14L, 5L, 8L, 11L, 12L, 5L, 5L, 
14L, 15L, 2L, 10L, 6L, 8L, 4L, 6L, 8L, 14L, 14L, 16L, 10L, 5L, 
1L, 12L, 17L, 12L, 16L, 16L, 5L, 10L, 14L, 8L, 19L, 5L, 4L, 4L, 
14L, 2L, 14L, 9L, 7L, 1L, 14L, 4L, 15L, 18L, 16L, 9L, 14L, 6L, 
14L, 12L, 11L, 4L, 7L, 8L, 12L, 9L, 16L, 2L, 6L, 15L, 1L, 1L, 
3L, 14L, 5L, 5L, 9L, 14L, 6L, 5L, 14L, 15L, 2L, 14L, 2L, 1L, 
8L, 5L, 10L, 1L, 1L, 16L, 5L, 2L, 9L, 9L, 1L, 12L, 10L, 1L, 4L, 
1L, 9L, 8L, 8L, 5L, 10L, 1L, 10L, 2L, 6L, 15L, 2L, 2L, 10L, 5L, 
6L, 10L, 19L, 19L, 6L, 5L, 6L, 7L, 7L, 8L, 5L, 16L, 5L, 6L, 6L, 
1L, 10L, 12L, 4L, 7L, 19L, 7L, 8L, 16L, 10L, 5L, 16L, 12L, 7L, 
7L, 19L, 4L, 6L, 1L, 15L, 7L, 8L, 16L, 4L, 10L, 15L, 11L, 10L, 
1L, 10L, 17L, 1L, 2L, 1L, 14L, 8L, 8L, 14L, 10L, 8L, 6L, 6L, 
8L, 5L, 7L, 5L, 1L, 5L, 7L, 9L, 2L, 1L, 9L, 14L), order = c(9, 
1, 9, 1, 3, 7, 10, 5, 5, 2, 8, 8, 4, 1, 2, 2, 9, 7, 7, 9, 2, 
2, 9, 5, 7, 10, 7, 2, 1, 8, 1, 1, 4, 8, 5, 7, 7, 5, 5, 2, 2, 
2, 8, 1, 5, 7, 5, 8, 7, 1, 5, 5, 7, 4, 1, 1, 1, 7, 4, 1, 1, 2, 
5, 3, 3, 8, 9, 4, 5, 5, 5, 10, 5, 7, 8, 8, 5, 2, 9, 3, 5, 3, 
10, 3, 7, 4, 7, 1, 1, 5, 3, 1, 4, 4, 7, 1, 7, 8, 7, 4, 4, 1, 
8, 4, 3, 1, 1, 4, 2, 3, 9, 9, 2, 8, 2, 3, 2, 5, 5, 8, 7, 7, 4, 
1, 1, 1, 2, 5, 5, 2, 3, 4, 3, 7, 4, 1, 1, 7, 5, 1, 2, 9, 8, 5, 
7, 2, 7, 1, 5, 5, 5, 3, 10, 9, 7, 10, 9, 7, 8, 6, 7, 8, 4, 9, 
9, 5, 7, 2, 2, 2, 4, 4, 8, 1, 8, 5, 8, 1, 1, 5, 8, 1, 5, 9, 7, 
8, 5, 2, 7, 7, 10, 1, 10, 10, 8, 5, 4, 7, 7, 9)), .Names = c("hhDomMil", 
"kclust", "order"), row.names = c(NA, 200L), class = "data.frame")

I want to create a stacked bar plot like this one Barplot.

The only problem is, that I would like to have the order of the stacks to fit this (ETB,PMA,PER,KON,TRA,DDR,BUM,MAT,HED,EXP) - the order numbers in the matrix and I have also some aesthetic problems. I searched for a solution here but none of the ordering suggestions worked for me... :-\

  1. How do I plot such a ordered plot?
  2. How do I set up x so that each bar is "on" one number?
  3. How do I seperate the bars - here I tried that with a white border...?
  4. How do I print all kclust numbers in x?

Thanks a lot for your help! Dominik


UPDATE

Here is the code I used to draw my plot:

mycols <- c('#FFFD00', '#97CB00', '#3168FF', '#FF0200', '#FB02FE', \
'#CCFCCC', '#FE9900', '#98CBF8', '#00CCFF', '#00FD03') # Set milieu colors


ggplot(MilDis) +
 geom_bar(aes(kclust, fill=factor(hhDomMil), \
 colour=mycols), position='fill', binwidth=1, colour='white') +
 scale_fill_manual(values = mycols)

UPDATE 2:

That's how I did it now:

    mycols <- c('#3168FF', '#00CCFF', '#98CBF8', '#CCFCCC', '#00FD03',\
   '#97CB00', '#FFFD00', '#FE9900', '#FB02FE', '#FF0200') # Set milieu colors

    ggplot(MilDis) +
      geom_bar(aes(factor(kclust), fill=reorder(hhDomMil,order)),\
      position='fill') +
      scale_fill_manual(values = mycols)

With this result:

Image

Thank you all for your help!

3
Can you post the ggplot code you used to get the plot shown here? It would save a little bit of time in getting up to speed to make the modifications (other than ordering, which @Gavin Simpson has dealt with below) that you are requesting ... - Ben Bolker
You should ask 1 question per Question - it makes it easier to search and find Answers. - Gavin Simpson
@Ben: I just updated my post. - Dominik
@Gavin You're right, but splitting it up would made it also more complicated... - Dominik
@Dominik ??? Why? I've Answered 1 and didn't even need the plotting code. 2,3, & 4 Just need kclust coercing to a factor - at the moment you are using a continuous variable and hence continuous scale for the x-axis. - Gavin Simpson

3 Answers

12
votes

I see that you have an order column in your data frame which I gather is your order. Hence you can simply do.

p0 = qplot(factor(kclust), fill = reorder(hhDomMil, order), position = 'fill', 
       data = df1)

Here are the elements of this code that take care of your questions

  1. How do I plot such a ordered plot? reorder
  2. How do I set up x so that each bar is "on" one number? factor(kclust)
  3. How do I seperate the bars?
  4. How do I print all kclust numbers in x? factor(kclust)

I remember from a previous question of yours that the hhDomMil corresponded to different groups, and I suspect your ordering follows the grouping. In that case, you might want to use that information to choose a color palette that makes it simpler to follow the graph. Here is one way to do it.

mycols = c(brewer.pal(3, 'Oranges'), brewer.pal(3, 'Greens'), 
           brewer.pal(2, 'Blues'), brewer.pal(2, 'PuRd'))

p0 + scale_fill_manual(values = mycols)

enter image description here

12
votes

The answer to this is easily solved by getting your data formatted correctly before passing it to ggplot(). The key is to explicitly set the levels of the hhDomMil factor. Assuming your data are in dat:

dat <- transform(dat, hhDomMil = factor(hhDomMil,
                                        levels = c("ETB", "PMA", "PER", "KON",
                                                   "TRA", "DDR", "BUM", "MAT",
                                                   "HED", "EXP")))

That fixes hhDomMil as a factor in place inside dat, and sets the levels to be in the order you wanted:

> head(dat$hhDomMil)
[1] HED ETB HED ETB PER BUM
Levels: ETB PMA PER KON TRA DDR BUM MAT HED EXP

Notice what is happing when R coerces hhDomMil to a factor:

> head(factor(as.character(dat$hhDomMil)))
[1] HED ETB HED ETB PER BUM
Levels: BUM DDR ETB EXP HED KON MAT PER PMA TRA

The default is to sort the levels alphabetically, which is why the plot is coming out as you show.

The best advice I can give, is to get your data correctly formatted first and only then try to plot it - don't rely on automatic or on-the-fly conversion to get this right for you; inevitably it won't be what you want.

7
votes

If you relevel your hhDomMil as a factor like this:

o<-c("ETB" "PMA" "PER" "KON" "TRA" "DDR" "BUM" "MAT" "HED" "EXP")
d$hh<-factor(d$hhDomMil,levels=o)

then your plot will be in the order you like:

ggplot(d,(aes(x=kclust, fill=hh))) +geom_bar(position="fill")