I would like to draw lines between select points on a ggplot scatterplot.

The data look like this:

data <- data.frame(
  id = c('par1', 'par1', 'par2', 'par3', 'par4', 'par5', 'par6'),
  site = c('site1', 'site1', 'site1', 'site1', 'site2', 'site3', 'site4'),
  age = c('20', '20', '25', '34', '27', '31', '29'),
  target = c('par6', 'par4', NA, 'par5', NA, NA, NA)
)

Basically, if a row has a non-NA value in target (which will always also exist in id), I'd like to draw a line from that row's point to the point for the target id.

The code for the basic scatterplot is:

ggplot(data, aes(x=id, y=age, color=site)) +
  geom_point()

I'd like to modify this so the plot ultimately looks something like the image here.

I believe the answer may lie in the group aes argument and geom_line(), but am not grasping how to proceed.

1 Answer

geom_segment() draws the line segments you asked for, but you first have to compute the start and end coordinates of each segment yourself.

The following code produces a plot like the one you described.

library(ggplot2)
library(dplyr)

data <- data.frame(
  id = c('par1', 'par1', 'par2', 'par3', 'par4', 'par5', 'par6'),
  site = c('site1', 'site1', 'site1', 'site1', 'site2', 'site3', 'site4'),
  age = c('20', '20', '25', '34', '27', '31', '29'),
  target = c('par6', 'par4', NA, 'par5', NA, NA, NA)
)

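# Base scatterplot; data and mapping are set on the layer, not in ggplot()
plt <- ggplot() +
  geom_point(data = data, mapping = aes(x = id, y = age, color = site))

# One row per segment: keep only the rows that have a target
new_d <- data %>% select(id, target) %>% filter(!is.na(target))
# Look up the age at the start (id) and at the end (target) of each segment
new_d$y <- data$age[match(new_d$id, data$id)]
new_d$ey <- data$age[match(new_d$target, data$id)]

# Pass new_d as the layer data instead of using new_d$... inside aes()
plt + geom_segment(data = new_d, mapping = aes(x = id, y = y, xend = target, yend = ey))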

This produces the following plot:

[Image: ggplot output]
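
As a variation on the approach above, the lookup of each target's age can probably also be written as a join. This is only a sketch under two assumptions that are not stated in the question: that age is meant to be numeric (it is stored as character in the sample data, which gives a discrete y axis), and that each id has a single age. The names ages, segments, and age_end are made up for illustration.

```r
library(ggplot2)
library(dplyr)

data <- data.frame(
  id = c('par1', 'par1', 'par2', 'par3', 'par4', 'par5', 'par6'),
  site = c('site1', 'site1', 'site1', 'site1', 'site2', 'site3', 'site4'),
  age = c('20', '20', '25', '34', '27', '31', '29'),
  target = c('par6', 'par4', NA, 'par5', NA, NA, NA)
)

# Assumption: age should be numeric so the y axis is continuous
data <- data %>% mutate(age = as.numeric(age))

# One row per id (assumes each id has a single age); used as a lookup table
ages <- data %>% distinct(id, age)

# Keep rows that have a target, then join in the target's age as age_end
segments <- data %>%
  filter(!is.na(target)) %>%
  left_join(ages %>% rename(target = id, age_end = age), by = "target")

# Segments are drawn first so the points sit on top of the lines
p <- ggplot(data, aes(x = id, y = age)) +
  geom_segment(data = segments, aes(xend = target, yend = age_end), color = "grey50") +
  geom_point(aes(color = site))

p
```

Setting x and y once in ggplot() lets geom_segment() inherit the start coordinates, so only xend and yend need to be mapped in that layer.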