
I've read through the ggplot2 docs website and other questions, but I couldn't find a solution. I'm trying to visualize some data for varying age groups. I have sort of managed to do it, but the plot does not look the way I intended.

Here is the code for my plot:

p <- ggplot(suggestion, aes(interaction(Age, variable), value, color = Age, fill = factor(variable), group = Age))
p + geom_bar(stat = "identity") +
  facet_grid(. ~ Age)

![The faceting separates the age variables][1]

My ultimate goal is to create a stacked bar graph, which is why I used fill, but the plot does not put the TDX values in their corresponding Age group and Year. (Sometimes the TDX values equal the DX values, but I want to visualize when they don't.) In short, I'm trying to fill the TDX values into the DX bars.

Here's the dput(suggestion)

    structure(list(Age = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 
1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 
3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 
5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 
7L), .Label = c("0-2", "3-9", "10-19", "20-39", "40-59", "60-64", 
"65+", "UNSP", "(all)"), class = "factor"), variable = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 
8L, 8L, 8L, 8L, 8L, 8L, 8L), .Label = c("Year.10.DX", "Year.11.DX", 
"Year.12.DX", "Year.13.DX", "Year.10.TDX", "Year.11.TDX", "Year.12.TDX", 
"Year.13.TDX"), class = "factor"), value = c(26.8648932910636, 
30.487741796656, 31.9938838749782, 62.8189679326958, 72.8480838120064, 
69.3044125928752, 36.9789457527416, 21.808001825378, 24.1073451428435, 
40.3305134762935, 70.4486116545885, 68.8342676191755, 63.9227718107745, 
34.6086468618636, 8.84033719571875, 13.2807072303835, 28.4781516422802, 
55.139497471546, 59.7230544500003, 67.9448927372699, 37.7293286937066, 
6.9507024051526, 17.4393054963572, 33.1485743479821, 61.198647580693, 
58.6845873573852, 48.0073013177248, 28.4455801248562, 26.8648932910636, 
19.8044453272475, 23.0189084635948, 53.7037832071889, 60.6516550126422, 
58.1573725886767, 27.0791868812255, 21.808001825378, 19.8146296425633, 
35.0587750051557, 62.3308555053346, 59.3299998610862, 56.5341245769817, 
27.7229319271878, 8.84033719571875, 13.2807072303835, 22.4081606349585, 
48.0252683906252, 52.7560684009579, 65.2890977685045, 32.4142337849399, 
6.9507024051526, 15.2833655677215, 24.5268503180754, 52.536784326675, 
51.4100599515986, 40.9609231655724, 18.1306673637441)), row.names = c(NA, 
-56L), .Names = c("Age", "variable", "value"), class = "data.frame")
You can create stacked bar plots with geom_bar(position = 'fill'), but I don't know why you would want that after all the stratifying/faceting here; each bar would end up a single color and be pretty meaningless. - rawr
Hey @rawr, thanks for your comment. Yeah, I think you're right that position = "fill" isn't super helpful. I'm trying to fill the values because it helps visualize how many people get diagnosed with a disease (DX) and, of those diagnosed, how many get treated (TDX). - user3900661

1 Answer


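It's unclear exactly what you need, but perhaps this (here a is your suggestion data frame). Note that each + has to end a line rather than start the next one; otherwise R treats the first line as a complete expression.

ggplot(a, aes(x = variable, y = value, fill = Age)) +
  geom_bar(stat = 'identity') +
  facet_wrap(~Age)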


If you want to visualize the TDX and DX entries separately, we'll need to reshape the data frame a bit.

> head(a)
    Age   variable    value
1   0-2 Year.10.DX 26.86489
2   3-9 Year.10.DX 30.48774
3 10-19 Year.10.DX 31.99388
4 20-39 Year.10.DX 62.81897
5 40-59 Year.10.DX 72.84808
6 60-64 Year.10.DX 69.30441

The column of interest, variable, combines the year with the DX/TDX label. We'll use the tidyr package to separate it into its own columns.

library(tidyr)
library(dplyr)
# sep is a regular expression, so the literal dot must be escaped
tidy_a <- a %>%
  separate(variable, into = c('nothing', 'year', 'label'), sep = "\\.")

This actually splits the levels of column variable into three components, since we split on . and the character . appears twice in each entry.

> head(tidy_a)
    Age nothing year label    value
1   0-2    Year   10    DX 26.86489
2   3-9    Year   10    DX 30.48774
3 10-19    Year   10    DX 31.99388
4 20-39    Year   10    DX 62.81897
5 40-59    Year   10    DX 72.84808
6 60-64    Year   10    DX 69.30441

So the column nothing is rather useless; it is just a by-product of using separate and splitting on the . character. With year and label in their own columns, we can now visualize TDX and DX separately.

ggplot(tidy_a,aes(x=year,y=value,fill=label)) + geom_bar(stat='identity') + facet_wrap(~Age)

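Two optional variations, offered as a sketch rather than as what you necessarily want. First, as far as I know, recent tidyr versions let you pass NA inside into to discard a piece, which avoids creating the nothing column at all. Second, geom_bar stacks by default, so in the plot above each bar's height is DX plus TDX; if TDX is meant to be a subset of DX (as the comments suggest), position = "dodge" puts the two bars side by side instead. The a_small data frame is a hypothetical four-row subset of the question's data with rounded values, used only to keep the example self-contained.

```r
library(tidyr)
library(ggplot2)

# Hypothetical subset of the question's data (values rounded), for illustration only
a_small <- data.frame(
  Age      = c("0-2", "3-9", "0-2", "3-9"),
  variable = c("Year.10.DX", "Year.10.DX", "Year.10.TDX", "Year.10.TDX"),
  value    = c(26.86, 30.49, 26.86, 19.80)
)

# NA in `into` drops the leading "Year" piece instead of storing it in a column
tidy_small <- separate(a_small, variable,
                       into = c(NA, "year", "label"), sep = "\\.")

# position = "dodge" draws DX and TDX next to each other rather than stacked
p_dodge <- ggplot(tidy_small, aes(x = year, y = value, fill = label)) +
  geom_bar(stat = "identity", position = "dodge") +
  facet_wrap(~Age)
p_dodge
```

Whether dodged or stacked bars are the better choice depends on how DX and TDX relate in your data, so treat the dodge version as one option to compare against the stacked plot.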