TL;DR - play with width= and position_dodge(width=...) in your geom_violin and geom_boxplot calls to adjust the positions, along with scale_x_discrete(expand=expansion(...)).
First, how close together or far apart things appear on your plot depends on the size of your graphics window. That said, the positioning of the plot elements relative to one another can be controlled via ggplot2. In particular, you want to change the values of width= and position_dodge(width=...) in your geom_violin call (and your geom_boxplot call).
Example Dataset
I'll use an example dataset to illustrate the idea. I'll plot boxplots only, but the idea is identical for violins. The example dataset contains two x values ("Group1" and "Group2"), each with subdivisions "A", "B", and "C", and a separate normal distribution of 50 datapoints for every combination of x and x.subdiv.
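library(ggplot2)

set.seed(8675309)  # make the random draws reproducible
df <- data.frame(
  x = c(rep('Group1', 150), rep('Group2', 150)),
  x.subdiv = rep(c(rep('A', 50), rep('B', 50), rep('C', 50)), 2),
  # six normal samples of 50 points, each with its own random mean and sd
  y = unlist(lapply(1:6, function(i) { rnorm(50, runif(1, 10, 15), runif(1, 0, 7)) }))
)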
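As a quick sanity check (my addition, not needed for the plots), you can confirm that every combination of x and x.subdiv holds 50 rows. The block rebuilds the same data frame so that it runs on its own:

```r
set.seed(8675309)
df <- data.frame(
  x = c(rep('Group1', 150), rep('Group2', 150)),
  x.subdiv = rep(c(rep('A', 50), rep('B', 50), rep('C', 50)), 2),
  y = unlist(lapply(1:6, function(i) { rnorm(50, runif(1, 10, 15), runif(1, 0, 7)) }))
)

# Cross-tabulate the two grouping columns: every cell should be 50
counts <- table(df$x, df$x.subdiv)
print(counts)
```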
Width of position_dodge
Here's the simple boxplot, where I'll use 0.5 as the value for both width= and position_dodge(width=...). Note that the first argument of position_dodge is width=, so you can supply the number directly without naming the argument.
p <- ggplot(df, aes(x=x, y=y)) + theme_bw()
p + geom_boxplot(aes(fill=x.subdiv), width=0.5, position=position_dodge(0.5))
The rule to note here is:
geom_boxplot(width=...) controls the combined width of all the boxes drawn at each x= value.
position_dodge(width=...) controls how far the groups are spread out (the amount of "dodging") around each x= value.
So this is what happens when you set position_dodge(width=1), but leave geom_boxplot(width=0.5):
p + geom_boxplot(aes(fill=x.subdiv), width=0.5, position=position_dodge(1))
The width of each box remains the same as before, but the positioning of each box around x= is more "spread out". In effect, each is "dodged" more. If you set position_dodge(width=0.2), you'll see the opposite effect, where the boxes become squished together (because they are not spread out as much around x=):
p + geom_boxplot(aes(fill=x.subdiv), width=0.5, position=position_dodge(0.2))
The interesting thing is how geom_boxplot(width=) and position_dodge(width=) are related:
If geom_boxplot(width=) is equal to position_dodge(width=), the boxes will be touching.
If geom_boxplot(width=) is less than position_dodge(width=), the boxes will be separated from one another.
If geom_boxplot(width=) is greater than position_dodge(width=), the boxes will be overlapping one another.
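To see why these three rules hold, here is a small base-R sketch of the dodge arithmetic. It assumes (this is my reading of how position_dodge behaves, not something derived from the plots above) that each of the n boxes gets a width of geom width / n and that box centers are spaced dodge width / n apart. Both dodge_edges and box_gap are hypothetical helpers written for this illustration, not ggplot2 functions.

```r
# Hypothetical helper (not part of ggplot2): approximate box edges after dodging
# n boxes around position x, given the geom width and the dodge width.
dodge_edges <- function(x, n, geom_width, dodge_width) {
  i <- seq_len(n)
  centre <- x + dodge_width * ((i - 0.5) / n - 0.5)
  data.frame(
    box  = i,
    xmin = centre - geom_width / n / 2,
    xmax = centre + geom_width / n / 2
  )
}

# Gap between the first two neighbouring boxes:
# positive = separated, (about) zero = touching, negative = overlapping
box_gap <- function(n, geom_width, dodge_width) {
  e <- dodge_edges(1, n, geom_width, dodge_width)
  e$xmin[2] - e$xmax[1]
}

box_gap(3, geom_width = 0.5, dodge_width = 0.5)  # equal widths: touching
box_gap(3, geom_width = 0.5, dodge_width = 1)    # dodge is larger: separated
box_gap(3, geom_width = 0.5, dodge_width = 0.2)  # dodge is smaller: overlapping
```

Under these assumptions the gap works out to (dodge width - geom width) / n, which is why only the difference between the two widths decides whether the boxes touch.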
Width of the geom
The width= of the geom itself sets how wide the boxplots are. Keep these two points in mind:
The width= is the sum of the widths of all the individual dodged geoms at that particular x= value.
width=1 is the distance between two adjacent values on a discrete axis, meaning that when you set width=1, the boxes of one group will be wide enough to touch those of the next group.
That means that if we set geom_boxplot(width=1), the combined total of all the boxes for "Group1" will be wide enough to touch the boxes of "Group2"... but you would only see that if there were no overlap among the boxes (meaning that position_dodge(width=) is equal to geom_boxplot(width=)).
So this makes the boxes wide enough to be touching, but position_dodge(width=) is less than geom_boxplot(width=), so the boxes overlap within each group while the "Group1" boxes stay separated from the "Group2" boxes:
p + geom_boxplot(aes(fill=x.subdiv), width=1, position=position_dodge(0.8))
If we want everything to touch, we have to set the two widths equal to each other, and both equal to 1:
p + geom_boxplot(aes(fill=x.subdiv), width=1, position=position_dodge(1))
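If you want to check the touching case numerically rather than by eye, one option is to pull the computed box edges out of the built plot with layer_data(). This is a sketch that assumes the built boxplot layer exposes xmin and xmax columns (it does in the ggplot2 versions I'm aware of). The data frame is rebuilt so the block runs on its own, and p_touch and edges are just names chosen for this example:

```r
library(ggplot2)

set.seed(8675309)
df <- data.frame(
  x = c(rep('Group1', 150), rep('Group2', 150)),
  x.subdiv = rep(c(rep('A', 50), rep('B', 50), rep('C', 50)), 2),
  y = unlist(lapply(1:6, function(i) { rnorm(50, runif(1, 10, 15), runif(1, 0, 7)) }))
)

p_touch <- ggplot(df, aes(x = x, y = y)) + theme_bw() +
  geom_boxplot(aes(fill = x.subdiv), width = 1, position = position_dodge(1))

# One row per box; xmin/xmax are the box edges after dodging
edges <- layer_data(p_touch)[, c("group", "xmin", "xmax")]
edges <- edges[order(edges$xmin), ]
print(edges)
```

If the reasoning above holds, each box's xmax should coincide with the next box's xmin, including across the boundary between "Group1" and "Group2".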
Control both widths
In the end, it's probably best to control both. Starting from the previous plot, you probably want some separation between "Group1" and "Group2". That means you need to make the combined width of the boxes smaller (which we control with geom_boxplot(width=)). However, you probably still want the dodging to leave a bit of space between the boxes, so we'll set position_dodge(width=) to be slightly greater than geom_boxplot(width=), but not so large that we lose the separation between "Group1" and "Group2". Something like this works pretty well:
p + geom_boxplot(aes(fill=x.subdiv), width=0.5, position=position_dodge(0.55))
In your case, you have both geom_violin and geom_boxplot, so you'll need to adjust those together and work out the proper look.
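As a starting point for the combined case, here is a sketch that layers geom_violin and geom_boxplot on the example data. The specific numbers (0.7, 0.15, 0.8) are just values I picked for illustration, not recommendations. The one design choice worth noting is giving both layers the same position_dodge() width so that each box stays centered inside its violin, while width= is set separately for each geom:

```r
library(ggplot2)

set.seed(8675309)
df <- data.frame(
  x = c(rep('Group1', 150), rep('Group2', 150)),
  x.subdiv = rep(c(rep('A', 50), rep('B', 50), rep('C', 50)), 2),
  y = unlist(lapply(1:6, function(i) { rnorm(50, runif(1, 10, 15), runif(1, 0, 7)) }))
)

# Share one dodge so violins and boxes are centered on the same positions
dodge <- position_dodge(width = 0.8)

p_both <- ggplot(df, aes(x = x, y = y, fill = x.subdiv)) + theme_bw() +
  geom_violin(width = 0.7, position = dodge) +   # wide violins
  geom_boxplot(width = 0.15, position = dodge)   # narrow boxes drawn on top

p_both
```

From there, widen or narrow each geom with its own width=, and change the shared dodge width to move the subgroups closer together or further apart.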
EDIT: "Shift Left and Right" and "Squish"
If the width= and position_dodge(width=...) arguments are just not quite getting you what you need, there is another parameter that can work in concert with them to move things around: scale_x_discrete(expand=...), which controls the amount of space to the left and right of your x axis items. Used together with width= and position_dodge(width=...), this gives you fine control over where your data sit along the x axis while still respecting the automated plotting that ggplot2 provides.
width= controls the whitespace between the x groups along the x axis
position_dodge(width=...) controls the amount of whitespace between the subgroups within each x group
scale_x_discrete(expand=...) controls the whitespace on the left and right sides of the panel
I'll demonstrate the functionality using the same dataset as before. Note that proper use of the expand= argument of scale_x_discrete should call expansion(), and you will need to provide a vector of length 1 or 2 to either add= or mult=. Play around with both to see the effect, but here's roughly what to expect.
The expansion() function takes mult= and add= as arguments, each of which can be a vector of length 2 (where the first value is applied to the left side and the second to the right side) or of length 1 (where the number is applied to both sides). Numbers sent to mult= are multiplied by the range of the scale to give the amount of padding, while numbers sent to add= are added in axis units (the default for a discrete scale is add=0.6). The code below therefore sets the extra whitespace on both the left and the right to 30% (0.3 * range) of the scale range:
p + geom_boxplot(aes(fill=x.subdiv), width=0.5, position=position_dodge(0.55)) +
scale_x_discrete(expand=expansion(mult=0.3))
Sending two values lets you adjust each side separately. This pads the left side by 100% of the scale range and the right side by 50%:
p + geom_boxplot(aes(fill=x.subdiv), width=0.5, position=position_dodge(0.55)) +
scale_x_discrete(expand=expansion(mult=c(1,0.5)))
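To make the mult= and add= arithmetic concrete, here is a base-R sketch of how the padded x limits could be computed for a discrete axis whose n levels sit at 1, 2, ..., n. expanded_limits is a hypothetical helper, not a ggplot2 function, and it encodes my reading of the expansion() documentation (mult= scales the range of the scale, add= is in axis units). The real panel may end up wider if a geom sticks out past these limits.

```r
# Hypothetical helper (not part of ggplot2): padded limits of a discrete x axis.
# Assumes at least two levels, placed at integer positions 1..n_levels.
expanded_limits <- function(n_levels, mult = 0, add = 0) {
  mult <- rep_len(mult, 2)      # length 1 is recycled to both sides
  add  <- rep_len(add, 2)
  axis_range <- n_levels - 1    # distance from the first to the last level
  c(lower = 1 - mult[1] * axis_range - add[1],
    upper = n_levels + mult[2] * axis_range + add[2])
}

expanded_limits(2, add = 0.6)         # discrete default: 0.4 to 2.6
expanded_limits(2, mult = 0.3)        # 0.7 to 2.3
expanded_limits(2, mult = c(1, 0.5))  # 0 to 2.5
```

With only two groups the scale range is 1 unit, so mult=0.3 leaves less padding than the default, while mult=c(1,0.5) pushes the data toward the right of the panel.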
Bottom Line: by using all three of width=, position_dodge(width=...), and scale_x_discrete(expand=expansion(...)), you can in principle place your x groupings anywhere along your plot. Just keep in mind that the size and aspect ratio of your graphics device will change how things are laid out a bit, so resizing the graphics window gives you some additional control.