I can draw several bar plots in a single plot with the following code (taken from another question):

mydata <- data.frame(Barplot1=rbinom(5,16,0.6), Barplot2=rbinom(5,16,0.25),
                     Barplot3=rbinom(5,5,0.25), Barplot4=rbinom(5,16,0.7))
barplot(as.matrix(mydata), main="Interesting", ylab="Total", beside=TRUE, 
        col=terrain.colors(5))
legend(13, 12, c("Label1","Label2","Label3","Label4","Label5"), cex=0.6, 
       fill=terrain.colors(5))

But my scenario is a bit different: my data is stored in 3 data.frames (each sorted by the V2 column), where the V1 column holds the Y values and the V2 column holds the X values:

> tail(hist1)
   V1 V2
67  2 70
68  2 72
69  1 73
70  2 74
71  1 76
72  1 84
> tail(hist2)
   V1  V2
87  1  92
88  3  94
89  1  95
90  2  96
91  1 104
92  1 112
> tail(hist3)
    V1  V2
103  3 110
104  1 111
105  2 112
106  2 118
107  2 120
108  1 138

Plotting a single one of them is as simple as:

barplot(hist3$V1, main="plot title", names.arg = hist3$V2)

But I cannot construct the matrix needed for the combined plot because of several problems that I can see right now (there may be others):

My data has different size:

> nrow(hist1)
[1] 72
> nrow(hist2)
[1] 92
> nrow(hist3)
[1] 108

There are X values (and therefore Y values too) that appear in one data.frame but not in another, e.g.:

> hist3$V2[which(hist3$V2==138)]
[1] 138
> hist1$V2[which(hist1$V2==138)]
integer(0)

What I need (I guess) is something that adds the missing V2 (x axis) values, with a Y value of 0, to each data.frame, so that they all have the same length and I can combine them as in the example above. See the following example with only 2 data.frames (the v2 and v1 columns are swapped relative to the previous example):

> # missing v2 for 3,4,5
> df1
  v2     v1
1  1      1
2  2      2
3  6      3
4  7      4
5  8      5
6  9      6
7  10     7

> # missing v2 for 1,2,9,10
> df2
  v2     v1
1  3      1
2  4      2
3  5      3
4  6      4
5  7      5
6  8      6


> # some_magic_goes_here ...

> df1
  v2     v1
1  1      1
2  2      2
3  3      0 # created
4  4      0 # created
5  5      0 # created
6  6      3
7  7      4
8  8      5
9  9      6
10 10     7

> df2
  v2     v1
1  1      0 # created
2  2      0 # created
3  3      1
4  4      2
5  5      3
6  6      4
7  7      5
8  8      6
9  9      0 # created
10 10     0 # created

Thanks

1 Answer

You can probably do this by 1) collecting all possible x-axis values (v2 values) from all the data.frames, and 2) using them to look up the existing values and fill the missing ones with zeroes.

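set.seed(111)
# Three example data.frames, each with 7 of the 10 possible v2 values;
# v1 gets one random value per row (size = 7) rather than a single recycled value
df1 <- data.frame(v2 = sample(1:10, size = 7),
                  v1 = sample(1:100, size = 7))
df2 <- data.frame(v2 = sample(1:10, size = 7),
                  v1 = sample(1:100, size = 7))
df3 <- data.frame(v2 = sample(1:10, size = 7),
                  v1 = sample(1:100, size = 7))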

First, retrieve your categories / x-axis values / v2. Note that if class(df1$v2) == "factor", then you should use levels() instead of unique().

my.x <- unique(c(df1$v2, df2$v2, df3$v2))

You will probably want it sorted:

my.x <- sort(my.x)
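
If v2 does turn out to be a factor whose labels are numbers, one possible way to get numeric x values is to go through as.character() first, because as.numeric() applied directly to a factor returns the internal level codes. A minimal sketch, using made-up factors f1 and f2 that are not part of the question's data:

```r
# f1 and f2 are hypothetical stand-ins for factor-typed v2 columns
f1 <- factor(c("70", "72", "138"))
f2 <- factor(c("72", "84"))

# Convert labels to numbers via character, then take the sorted union
x.all <- sort(unique(c(as.numeric(as.character(f1)),
                       as.numeric(as.character(f2)))))
x.all
```

This only makes sense when every factor label can be parsed as a number; otherwise as.numeric() produces NA with a warning.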

Now, use my.x to re-order/fill your data.frames, starting with df1. Specifically, you check each value of my.x: if that value is included in df1$v2, then the corresponding v1 is returned, otherwise 0.

my.df1 <- data.frame(v2 = my.x, 
                      v1 = sapply(my.x, (function(i){
                        ifelse (i %in% df1$v2, df1$v1[df1$v2 == i], 0)
                      })))
my.df1
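
The same lookup can probably also be written without a loop, using match(). The sketch below is an alternative, not the approach used above; fill_zero() is a hypothetical helper name, and the toy data is copied from the question's two-data.frame example. It assumes each v2 value occurs at most once per data.frame.

```r
# fill_zero() is a hypothetical helper, not part of base R
fill_zero <- function(df, x) {
  idx <- match(x, df$v2)    # row position of each x in df$v2, NA when absent
  v1 <- df$v1[idx]          # existing v1 values, NA where x is missing
  v1[is.na(idx)] <- 0       # zero only the x values that are not in df$v2
  data.frame(v2 = x, v1 = v1)
}

# Toy data: df1 from the question, missing v2 values 3, 4 and 5
toy <- data.frame(v2 = c(1, 2, 6, 7, 8, 9, 10), v1 = 1:7)
res <- fill_zero(toy, 1:10)
res
```

Because match() returns only the first matching position, duplicated v2 values would be silently ignored here.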

A simple way to apply this operation to all your data.frames is to put them in a list and then use lapply():

dfs <- list(df1 = df1, df2 = df2, df3 = df3)
dfs <- lapply(dfs, (function(df){
  data.frame(v2 = my.x, 
             v1 = sapply(my.x, (function(i){
               ifelse (i %in% df$v2, df$v1[df$v2 == i], 0)
             })))
}))
# show all data.frames    
dfs

# show df1
dfs$df1
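
Once every data.frame shares the same v2 values, the grouped bar plot from the question should be within reach: stack the v1 columns into a matrix with one row per data.frame and pass it to barplot() with beside=TRUE. A minimal sketch, assuming padded data shaped like dfs above; the list name padded and the toy values (taken from the question's expected output) are stand-ins:

```r
# Toy stand-ins for the padded data.frames (identical v2 in each)
x <- 1:10
padded <- list(
  df1 = data.frame(v2 = x, v1 = c(1, 2, 0, 0, 0, 3, 4, 5, 6, 7)),
  df2 = data.frame(v2 = x, v1 = c(0, 0, 1, 2, 3, 4, 5, 6, 0, 0))
)

# One row per data.frame, one column per x value
counts <- t(sapply(padded, function(df) df$v1))
colnames(counts) <- x

# Each x value becomes a group of bars, one bar per data.frame
barplot(counts, beside = TRUE, names.arg = x,
        col = terrain.colors(nrow(counts)),
        main = "plot title", ylab = "Total")
legend("topright", legend = rownames(counts),
       fill = terrain.colors(nrow(counts)), cex = 0.6)
```

With the real data, replacing padded by dfs and x by my.x should give one group of three bars per x value.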