2
votes

Many hours of manic googling and leafing through ggplot2 documentation having brought me no closer, I was hoping someone could maybe nudge me in the right direction.

I have cell count data for a few thousand subjects in a data.frame with the following layout:

  • 1 subject per row.
  • 1 column per cell type (5 total, each holding the percentage value for that cell type, summing to 100%).
  • 2 extra columns: one indicating which Group (experimental or control) the subject belongs to, and one indicating which experiment it belongs to (1, 2, 3, 4, etc.).

I would like to generate a ggplot2 jitter plot, percentage along the Y-axis, cell type categories along the X-axis (5 total) and further color the data points based on their Group (experimental or control). It would be great if I could further color the data points from different experiments in shades of the Group color (i.e. Experiment number sort of defining a gradient from light to dark - all Experiment-1 points would be light - either red or blue based on which Group they belonged to), but I don't know if that's even possible.

For starters: is my data even laid out properly to attempt to create this plot? The reason I ask is that I really feel like I'm fighting ggplot2 in attempting to get anything plotted with the data.frame in its current layout (but the native boxplot() seems to work fine with very little modification...)

Any help or nudges in the right direction would be greatly appreciated.


EDIT:

This is the output of dput(head(dat, 10)).

structure(list(Neutrophils = c(38, 70.7, 62.1, 90.5, 65.8, 39.2, 
89.4, 91.3, 55.4, 14.5), Lymphocytes = c(47.5, 17.1, 20.3, 2, 
25, 37.1, 6.3, 1.6, 31.3, 61.5), Monocytes = c(12.4, 11.8, 14.6, 
4.8, 7.3, 14.1, 3.7, 4.6, 8.4, 21.9), Eosinophils = c(1.4, 0.1, 
2.5, 2.4, 1.4, 9.2, 0.1, 2.5, 4.6, 1.3), Basophils = c(0.8, 0.3, 
0.5, 0.3, 0.5, 0.4, 0.5, 0, 0.3, 0.8), Group = c(1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L)), .Names = c("Neutrophils", "Lymphocytes", 
"Monocytes", "Eosinophils", "Basophils", "Group"), row.names = c("B145", 
"B196", "B212", "B246", "B250", "B286", "B343", "B355", "B369", 
"B386"), class = "data.frame")
2
A good way to get help is to provide example data. If your data is named dat, you could post the result of dput(head(dat, 20)). - Ian Fellows

2 Answers

1
votes

You first need to reshape your data using the melt function in the reshape package.
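As a rough sketch of that reshaping step, the code below melts the first three rows of the question's dput() output into long format (one row per subject and cell type) and draws a jitter plot coloured by Group. It assumes the reshape2 package (the successor to reshape, whose melt() is called the same way here); the column names Subject, CellType and Percent are made up for illustration.

```r
library(reshape2)  # assumption: reshape2 rather than the older reshape package
library(ggplot2)

# First three rows of the data posted in the question.
dat <- data.frame(
  Neutrophils = c(38, 70.7, 62.1),
  Lymphocytes = c(47.5, 17.1, 20.3),
  Monocytes   = c(12.4, 11.8, 14.6),
  Eosinophils = c(1.4, 0.1, 2.5),
  Basophils   = c(0.8, 0.3, 0.5),
  Group       = c(1L, 1L, 1L),
  row.names   = c("B145", "B196", "B212")
)

# Keep the subject IDs as a regular column, because melt() drops row names.
dat$Subject <- rownames(dat)

# Wide -> long: every cell-type column becomes a (variable, value) pair.
long <- melt(dat, id.vars = c("Subject", "Group"))

# Illustrative column names (hypothetical, not from the question).
names(long)[names(long) == "variable"] <- "CellType"
names(long)[names(long) == "value"] <- "Percent"

# Cell type on x, percentage on y, points coloured by Group.
p <- ggplot(long, aes(x = CellType, y = Percent, colour = factor(Group))) +
  geom_jitter(width = 0.2, height = 0) +
  labs(x = "Cell type", y = "Percentage", colour = "Group")
print(p)
```

Setting height = 0 keeps the jitter horizontal only, so the plotted percentages stay at their true values.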

I'm sure someone will come along with a more elegant way to color points on a gradient, but you can do it manually: create a new column holding a color for each combination of group and/or experiment, then map the color aesthetic to that column (with scale_colour_identity(), so that ggplot2 uses the stored values as-is).
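A minimal sketch of the manual approach, under some loud assumptions: the data are already in long format, an Experiment column exists (it is not in the question's dput() output, so the data frame below is hypothetical), and the blue and red ramps are arbitrary choices. Each group gets its own light-to-dark ramp, and each row looks up its shade by experiment.

```r
library(ggplot2)

# Hypothetical long-format data; Experiment is NOT part of the posted data.
long <- data.frame(
  CellType   = rep(c("Neutrophils", "Lymphocytes"), each = 6),
  Percent    = c(38, 70.7, 62.1, 90.5, 65.8, 39.2,
                 47.5, 17.1, 20.3, 2, 25, 37.1),
  Group      = rep(c("control", "experimental"), times = 6),
  Experiment = rep(c(1, 1, 2, 2, 3, 3), times = 2)
)

n_exp <- length(unique(long$Experiment))

# One light-to-dark ramp per group (colours chosen arbitrarily).
shades <- list(
  control      = colorRampPalette(c("lightblue", "darkblue"))(n_exp),
  experimental = colorRampPalette(c("pink", "darkred"))(n_exp)
)

# Rank of each row's experiment among the sorted experiment numbers.
exp_rank <- match(long$Experiment, sort(unique(long$Experiment)))

# Look up each row's colour from its group's ramp.
long$PointColour <- mapply(
  function(g, i) shades[[g]][i],
  as.character(long$Group), exp_rank,
  USE.NAMES = FALSE
)

# scale_colour_identity() tells ggplot2 to use the stored colours directly.
p <- ggplot(long, aes(x = CellType, y = Percent, colour = PointColour)) +
  geom_jitter(width = 0.2, height = 0) +
  scale_colour_identity()
print(p)
```

By default an identity scale draws no legend, so the group and experiment shades would need to be explained separately, for example in the caption.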

0
votes

It would be great if I could further color the data points from different experiments in shades of the Group color (i.e. Experiment number sort of defining a gradient from light to dark - all Experiment-1 points would be light - either red or blue based on which Group they belonged to), but I don't know if that's even possible.

Not currently, sorry.