
I have some long-format data (one observation per row) of different treatments at different time points. I would like to make the plot shown below, where each line represents a treatment over time and follows the mean. I have some experience with ggplot, but I only know how to make this graph when there is a single observation for each treatment group at each time step (using geom_line).

[Image: the target plot, described below]

It is a graph with time on the x-axis and the response value on the y-axis, with each treatment in a different colour: three dots per treatment at each time point (n=3) and a line following the average of those three values.

Comments:

Please provide an example of your data so we can better understand what needs to be done. You can also provide a link to your image, and someone will edit your question to display it. - Phil

Thank you, it is from a paper I was reading. Here is the image: imgur.com/h8Kfyfp - Artically

Please add an example to clarify your question. - hamed baziyad

Uploaded your image (once the edit is approved). Can you share your dataset in the question as well? Make sure you share it in a format that can be replicated, such as typing dput(your.data) in the console and pasting the output in the body of your question (formatted as code, please). Finally, you mentioned you were able to create a plot; can you share that too? If not, at least share the code used to create it and we can help. - chemdork123

1 Answer


There are two approaches you can use, depending on your data set. In the absence of your data, I'll show you both using the iris dataset. It's not the best example dataset for this, but... it works to demonstrate these methods.

First, the basic plot:

library(ggplot2)

# Base plot: one line per Species, connecting every raw observation
p <- ggplot(iris, aes(Sepal.Length, Sepal.Width, color=Species)) + theme_bw() +
  geom_line()
p

[Image: the basic plot p, one line per species]

So there's the general way to generate the plot. From your description of your dataset, it seems you have an x, y, and treatment column that should correspond to the code above as iris$Sepal.Length, iris$Sepal.Width and iris$Species, respectively.

Method 1: Overlay a Plot of Aggregate Data

The first method takes more prep work upfront, but it is often the more straightforward one to work with. The idea is that you create a separate dataframe holding your summary data and then plot it with a separate call to geom_line(). The advantage of this method is that you have direct access to the summary data, so it is relatively easy to pull out an individual data point from it. If you plan to do any further plotting or analysis of the summary data, this might be the way to go.

Create your summary dataset. There are lots of methods to do this - here I'm using dplyr methods:

library(dplyr)

iris_summary <- iris %>%
  group_by(Sepal.Length) %>%
  summarize(mean_width=mean(Sepal.Width))

Then create the plot by adding another call to geom_line(), but setting the data= argument to the summary dataset, iris_summary. In addition, you will need to remap the y= aesthetic, since the column is now iris_summary$mean_width instead of iris$Sepal.Width. Note that this summary averages over all species, so the result is a single overall mean line:

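# Overlay the overall mean line from the summary data
# (linewidth needs ggplot2 >= 3.4.0; older versions use size instead)
p1 <- p +
  geom_line(
    data=iris_summary,
    aes(y=mean_width),
    color='black', linewidth=1
  )
p1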

[Image: plot p1, the basic plot with the black mean line overlaid]
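The question asks for one mean line per treatment rather than a single overall line. A minimal sketch of how Method 1 might be adapted for that, still using iris as a stand-in: group the summary by Species as well, and let the mean line inherit the colour mapping. The names iris_group_summary and p1_grouped are mine, and the raw data are drawn as points here instead of lines. This assumes dplyr 1.0.0 or later (for .groups) and ggplot2 3.4.0 or later (for linewidth; older versions use size).

```r
library(ggplot2)
library(dplyr)

# Mean Sepal.Width for every Species at every Sepal.Length value
iris_group_summary <- iris %>%
  group_by(Species, Sepal.Length) %>%
  summarize(mean_width = mean(Sepal.Width), .groups = "drop")

# Raw observations as points, plus one mean line per Species.
# The line layer inherits x and color from the global aes();
# only y is remapped to the summary column.
p1_grouped <- ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  theme_bw() +
  geom_point(alpha = 0.5) +
  geom_line(data = iris_group_summary, aes(y = mean_width), linewidth = 1)
p1_grouped
```

With your own data, Species would correspond to the treatment column and Sepal.Length to the time column.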

Method 2: Use stat_summary()

The second method is very straightforward, but it does not give you easy access to the aggregated data. If all you want to do is plot that summary data as a line, then this is definitely the simplest. Setting aes(group=1) overrides the per-Species grouping inherited from p, so a single mean line is drawn across all species.

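# Compute the mean of y at each x inside the plotting call
# (linewidth needs ggplot2 >= 3.4.0; older versions use size instead)
p2 <- p +
  stat_summary(
    geom='line', fun=mean, aes(group=1),
    linewidth=1, color='black'
  )
p2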

I'm not showing the plot, since the result is identical to that of p1 using Method 1 above.
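For data shaped like the one described in the question, Method 2 might look like the sketch below. Since no data were posted, the data frame df and its columns time, treatment, replicate and response are made up for illustration (3 treatments, 4 time points, n=3). Dropping aes(group=1) lets stat_summary() compute the mean per treatment, because the colour mapping already defines the groups. As above, linewidth assumes ggplot2 3.4.0 or later.

```r
library(ggplot2)

# Hypothetical long-format data: 3 treatments x 4 time points x 3 replicates
set.seed(1)
df <- expand.grid(
  time = c(0, 1, 2, 3),
  treatment = c("A", "B", "C"),
  replicate = 1:3
)
# Made-up response values that rise with time and differ by treatment
df$response <- rnorm(nrow(df), mean = df$time + as.integer(factor(df$treatment)))

# Raw observations as points, plus one mean line per treatment
p_treat <- ggplot(df, aes(time, response, color = treatment)) +
  theme_bw() +
  geom_point() +
  stat_summary(geom = "line", fun = mean, linewidth = 1)
p_treat
```

This relies on time being numeric. If time is stored as a factor or character column, ggplot2 groups by every discrete variable in the plot, so the lines will likely not connect across time points unless you add aes(group = treatment) to the stat_summary() layer.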