1
votes

I have a problem connecting two points with the same y value. My dataset looks like this (I hope the formatting is ok):

attackerip,min,max
125.88.146.123,2016-03-29 17:38:17.949778,2016-03-30 07:28:47.912983
58.218.205.101,2016-04-05 15:53:20.69986,2016-05-12 17:32:08.583255
183.3.202.195,2016-04-05 15:58:27.862509,2016-04-15 18:15:13.117774
58.218.199.166,2016-04-05 16:09:34.448588,2016-04-24 06:02:12.237922
58.218.204.107,2016-04-05 16:57:17.624509,2016-05-31 00:52:44.007908

What I have so far is the following:

mydata = read.csv("timeline.csv", sep=',')
mydata$min <- strptime(as.character(mydata$min), format='%Y-%m-%d %H:%M:%S')
mydata$max <- strptime(as.character(mydata$max), format='%Y-%m-%d %H:%M:%S')
plot(mydata$min, mydata$attackerip, col="red")
points(mydata$max, mydata$attackerip, col="blue")

Which results in: this Plot

Now I want to connect the points that share the same y-axis value, but I cannot get lines or abline to work. Thanks in advance!

EDIT: dput of data

dput(mydata)
structure(list(attackerip = structure(c(1L, 5L, 2L, 3L, 4L), .Label = c("125.88.146.123", 
"183.3.202.195", "58.218.199.166", "58.218.204.107", "58.218.205.101"
), class = "factor"), min = structure(1:5, .Label = c("2016-03-29 17:38:17.949778", 
"2016-04-05 15:53:20.69986", "2016-04-05 15:58:27.862509", "2016-04-05 16:09:34.448588", 
"2016-04-05 16:57:17.624509"), class = "factor"), max = structure(c(1L, 
4L, 2L, 3L, 5L), .Label = c("2016-03-30 07:28:47.912983", "2016-04-15 18:15:13.117774", 
"2016-04-24 06:02:12.237922", "2016-05-12 17:32:08.583255", "2016-05-31 00:52:44.007908"
), class = "factor")), .Names = c("attackerip", "min", "max"), class = "data.frame", row.names = c(NA, 
-5L))

Final Edit:

Plotting the lines did not work because min and max were stored as timestamps. Casting them to numeric values gave the expected result. Thanks for your help, everyone.
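For anyone landing here later, this is a minimal sketch of what that fix could look like with the five rows above. It is not necessarily the exact code I ended up with: the names t_min, t_max and ypos are made up for illustration, the fractional seconds are dropped by the format string, and segments() from base graphics is used to draw one horizontal line per IP.

```r
# Rebuild the five rows from the question as plain text columns
mydata <- data.frame(
  attackerip = c("125.88.146.123", "58.218.205.101", "183.3.202.195",
                 "58.218.199.166", "58.218.204.107"),
  min = c("2016-03-29 17:38:17.949778", "2016-04-05 15:53:20.69986",
          "2016-04-05 15:58:27.862509", "2016-04-05 16:09:34.448588",
          "2016-04-05 16:57:17.624509"),
  max = c("2016-03-30 07:28:47.912983", "2016-05-12 17:32:08.583255",
          "2016-04-15 18:15:13.117774", "2016-04-24 06:02:12.237922",
          "2016-05-31 00:52:44.007908"),
  stringsAsFactors = FALSE
)

# Parse the timestamps, then cast to numeric (seconds since 1970-01-01)
fmt <- "%Y-%m-%d %H:%M:%S"
t_min <- as.numeric(as.POSIXct(mydata$min, format = fmt, tz = "UTC"))
t_max <- as.numeric(as.POSIXct(mydata$max, format = fmt, tz = "UTC"))

# One integer y position per attacker IP
ypos <- as.integer(factor(mydata$attackerip))

# Empty plot with custom axes, so the numeric x values are labelled as dates
par(mar = c(4, 8, 1, 1))
plot(range(c(t_min, t_max)), range(ypos), type = "n",
     xaxt = "n", yaxt = "n", xlab = "time", ylab = "")
ticks <- pretty(as.POSIXct(range(c(t_min, t_max)), origin = "1970-01-01", tz = "UTC"))
axis(1, at = as.numeric(ticks), labels = format(ticks, "%m-%d"))
axis(2, at = ypos, labels = mydata$attackerip, las = 1, cex.axis = 0.7)

# First and last sighting per IP, joined by a horizontal segment
points(t_min, ypos, col = "red")
points(t_max, ypos, col = "blue")
segments(t_min, ypos, t_max, ypos)
```

Because every row already holds both endpoints, a single vectorised segments() call is enough here and no search for duplicated y values is needed.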

1
Will edit it in, but that looks even worse. (Wirsiing)

1 Answer

3
votes

The lines function should work just fine. However, you will need to call it once for every pair (or set) of points that share the same y value. Here is a reproducible example (the vectors x and y are defined in the data section at the end):

# get sets of observations with the same y value
dupeVals <- unique(y[duplicated(y) | duplicated(y, fromLast = TRUE)])
# put the corresponding indices into a list
dupesList <- lapply(dupeVals, function(i) which(y == i))

# scatter plot
plot(x, y)
# draw one line per set of indices; invisible() suppresses the printed list of NULLs
invisible(lapply(dupesList, function(i) lines(x[i], y[i])))

This returns a scatter plot in which the points that share a y value are joined by lines.

data

set.seed(1234)
x <- sort(5 * runif(30))
y <- sample(25, 30, replace = TRUE)
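
To see what the duplicated() step picks out, here is a tiny hand-checkable vector. The names yy, keep, vals and idx are made up for this illustration so they do not overwrite the x and y above.

```r
# Small made-up vector: 3 appears twice, 7 three times, 9 once
yy <- c(3, 7, 3, 9, 7, 7)

# TRUE for every element whose value occurs more than once,
# including the first occurrence (hence the fromLast = TRUE pass)
keep <- duplicated(yy) | duplicated(yy, fromLast = TRUE)
print(keep)   # TRUE TRUE TRUE FALSE TRUE TRUE

# The distinct repeated values, and the positions of each
vals <- unique(yy[keep])
idx <- lapply(vals, function(v) which(yy == v))
print(vals)   # 3 7
print(idx)    # positions 1 3 for the value 3, positions 2 5 6 for the value 7
```

Using duplicated() alone would miss the first occurrence of each repeated value, which is why the two passes are combined with |.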

As it appears that you have two separate groups for which you would like to draw these lines, the algorithm would be as follows:

  1. for each group (min and max, I believe):
    • calculate the duplicate values of the y variable
    • put the indices of these duplicates into a dupesList (maybe dupesListMin and dupesListMax).
  2. plot the points
  3. run one sapply function on each dupesList.
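
The steps above could be sketched roughly as follows. This is only an illustration under assumptions: the data are made up (two groups with their own x and y vectors), and getDupesList is a hypothetical helper that wraps the two lines from the example at the top of this answer.

```r
# Hypothetical helper: for each repeated y value, the indices that share it
getDupesList <- function(y) {
  dupeVals <- unique(y[duplicated(y) | duplicated(y, fromLast = TRUE)])
  lapply(dupeVals, function(v) which(y == v))
}

# Made-up data for the two groups
set.seed(1234)
xMin <- sort(5 * runif(15))
yMin <- sample(8, 15, replace = TRUE)
xMax <- sort(5 + 5 * runif(15))
yMax <- sample(8, 15, replace = TRUE)

# Step 1: one list of index sets per group
dupesListMin <- getDupesList(yMin)
dupesListMax <- getDupesList(yMax)

# Step 2: plot the points of both groups on shared axes
plot(c(xMin, xMax), c(yMin, yMax), type = "n", xlab = "x", ylab = "y")
points(xMin, yMin, col = "red")
points(xMax, yMax, col = "blue")

# Step 3: one pass over each list, drawing a line through each index set
invisible(lapply(dupesListMin, function(i) lines(xMin[i], yMin[i], col = "red")))
invisible(lapply(dupesListMax, function(i) lines(xMax[i], yMax[i], col = "blue")))
```

Keeping the two lists separate means lines are only drawn within a group, never between a min point and a max point.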