3
votes

Hi i'm converting some 1 min data to 5 min data, and i'm finding it does 4 mins for the first increment, then goes on to do 5 min increments after that.

I've tried messing around with all the "indexAt" parameters but none give me what i want, which is starting from 5, then 10, 15, 20 etc.

i've tried

x5 <- to.minutes5(x)

AND

x <- to.period(x,
          period = 'minutes', 
          k = 5, 
          OHLC = TRUE)

1 min data

                     Open  High   Low Close Volume
2013-01-16 00:01:00 93.55 93.60 93.54 93.58      5
2013-01-16 00:02:00 93.59 93.60 93.58 93.58      5
2013-01-16 00:03:00 93.59 93.60 93.58 93.58      5
2013-01-16 00:04:00 93.58 93.58 93.57 93.57     12
2013-01-16 00:05:00 93.57 93.57 93.55 93.70     21
2013-01-16 00:06:00 93.56 93.56 93.56 93.56      5
2013-01-16 00:07:00 93.56 93.56 93.55 93.55      3
2013-01-16 00:08:00 93.55 93.55 93.55 93.55      2
2013-01-16 00:09:00 93.55 93.56 93.55 93.56      2
2013-01-16 00:10:00 93.56 93.56 93.56 93.56      1
2013-01-16 00:11:00 93.57 93.57 93.57 93.57      3

after converting to 5 min

                    clemtest.Open clemtest.High clemtest.Low clemtest.Close clemtest.Volume
2013-01-16 00:04:00         93.55         93.60        93.54          93.57              27
2013-01-16 00:09:00         93.57         93.57        93.55          93.56              33
2013-01-16 00:14:00         93.56         93.57        93.56          93.57               8
2013-01-16 00:19:00         93.56         93.58        93.51          93.53              77
2013-01-16 00:24:00         93.53         93.55        93.49          93.49             121
2013-01-16 00:29:00         93.49         93.51        93.49          93.51             121

the calculations are correct, its just not starting with the first 5 mins of data, it start with 4 mins, then goes onto 5 mins after that.

(using indexAt='startof' gives me the correct, 5, 10, 15... but when inspecting the bar the 5 min data represents the start of that 5 min (eg min 5 - 10) not min 0 - 5)

Here is the tail of the 1min data for reference.

                         Open  High   Low Close Volume
2013-01-17 23:53:00 95.52 95.52 95.52 95.52      2
2013-01-17 23:55:00 95.51 95.52 95.51 95.52      2
2013-01-17 23:56:00 95.51 95.51 95.51 95.51      1
2013-01-17 23:57:00 95.52 95.52 95.52 95.52      1
2013-01-17 23:59:00 95.52 95.52 95.51 95.51      4
2013-01-18 00:00:00 95.51 95.51 95.51 95.51      8
1
to.period (and therefore to.minutes, to.minutes5, etc) use the endpoints of each interval to aggregate the data. The last observation in the first 5 minutes of your data is at 00:04:00, which is what you see. 00:05:00 is the beginning of the second 5-minute interval in the zero hour. Can you add some detail about why you want to do this?Joshua Ulrich
Take a look at library(highfrequency) package, they have a function called aggregatets() which might be what you are looking for. @user1736644Rime
Hi Joshua, Why i want this is: I'm working with multiple time frames, tick, 5min and 60min all in the same strategy. I would like the end of the 12th 5 min bar to match up with the 60min bar. But if i'm getting 4,9,14 bars, it won't match. And I would like the close of the first 5min bar to equal the close of the 5th 1 min bar, In this case the close of the 5th 1 min bar is 93.70, therefore i would like the close of the first 5min bar to be 93.70 and have a time stamp of 00:05:00. (if i look a the tick data from 00:00:00 - 00:04:59 it starts from 93.55 and finish at 93.70). ThanksGeV 126
Also if you notice the volume, The first 5 mins of the day are from 00:00:00 - 00:05:00 therefore the volume of the first 5 mins should be 5+5+5+12+21(48)GeV 126
If it helps I also added the tail of the 1min data to the original questionGeV 126

1 Answers

1
votes

Is this what you're looking for (using just 12 minutes of data, as that is all you posted)?

x <- read.table(text ="
2013-01-16 00:01:00 93.55 93.60 93.54 93.58      5
2013-01-16 00:02:00 93.59 93.60 93.58 93.58      5
2013-01-16 00:03:00 93.59 93.60 93.58 93.58      5
2013-01-16 00:04:00 93.58 93.58 93.57 93.57     12
2013-01-16 00:05:00 93.57 93.57 93.55 93.70     21
2013-01-16 00:06:00 93.56 93.56 93.56 93.56      5
2013-01-16 00:07:00 93.56 93.56 93.55 93.55      3
2013-01-16 00:08:00 93.55 93.55 93.55 93.55      2
2013-01-16 00:09:00 93.55 93.56 93.55 93.56      2
2013-01-16 00:10:00 93.56 93.56 93.56 93.56      1
2013-01-16 00:11:00 93.57 93.57 93.57 93.57      3")

colnames(x) <- c("Date", "time",  "Open",  "High",   "Low", "Close", "Volume")


xt <- xts(x[, 3:7], order.by = as.POSIXct(paste0(x$Date, x$time, " ")) - 0.000001)



xt5 <- to.period(xt, period = "minutes", k =5)
xt5 <- align.time(xt5, n = 300)
xt5

#                     xt.Open xt.High xt.Low xt.Close xt.Volume
# 2013-01-16 00:05:00   93.55   93.60  93.54    93.70        48
# 2013-01-16 00:10:00   93.56   93.56  93.55    93.56        13
# 2013-01-16 00:15:00   93.57   93.57  93.57    93.57         3

To include the bar with stamp "2013-01-16 00:05:00" in the 5 minute bar that includes the interval ["2013-01-16 00:00:00", "2013-01-16 00:04:59.99999"], you could reduce the underlying time by a tiny amount of a second (a slightly negative quantity, here say -0.000001), so that it is included in the first 5 minute interval.

I think the confusion in your comments is avoided if you make the decision of stating whether the timestamp on the OHLC bar data is at the start of the bar or the end of the bar. i.e. does "2013-01-16 00:01:00" mean the OHLC for the interval (2013-01-16 00:00:00 to 2013-01-16 00:00:59.999) or (2013-01-16 00:01:00, 2013-01-16 00:01:59.999). In your case, it is at the end of the bar (the former case).

And the timestamp being the start of the bar for OHLC data isn't a good idea as it introduces look forward bias when you merge xts objects on different bar intervals together.