This is similar to several other questions, but I've still not managed to get it to work. I'm trying to apply a rolling function (runMAD
from the TTR package) to a series of financial info (trading price and volume), but the rolling operation should all be intraday (i.e. not cross into the next day, nor from the previous day).
The data is stored as an xts object, with two columns (price and volume). I split this object by day to create a list of xts objects by day, hoping I could then apply the runMAD
function over each day, and return a list of the xts objects with four columns (the two original price and volume columns, and then two new ones for runMAD-price, and runMAD-volume). It seems, however, that lapply
can only return a list of length equal to the number of days - so a function like a daily mean would work, but not sure how to get rolling functions to work which throw up multiple results per day. Ultimately, I'd like to test each row of price and volume to see if its deviation from the median is, say, 3 times greater than its respective rolling MAD. Such instances would then be saved in a variable (with the xts index).
sample data:
2011-01-02 09:11:00 20.20 20000
2011-01-02 10:42:00 20.40 50000
2011-01-02 12:33:00 20.90 5000
2011-01-02 12:33:00 20.90 10000
2011-01-02 13:25:00 20.90 10000
2011-01-02 15:36:00 20.90 50000
2011-01-02 15:39:00 20.90 50000
2011-01-02 16:12:00 20.90 15000
2011-01-02 16:33:00 20.90 50000
2011-01-02 17:04:00 20.90 50000
2011-01-02 17:05:00 20.90 50000
2011-01-02 18:26:00 20.90 10000
2011-01-03 09:21:00 20.20 20000
2011-01-03 11:19:00 20.40 50000
2011-01-03 17:33:00 20.90 5000
2011-01-06 12:33:00 20.90 10000
2011-01-06 13:25:00 20.90 10000
2011-01-06 15:36:00 20.90 50000
2011-01-06 15:39:00 20.90 50000
2011-01-06 16:20:00 20.90 15000
2011-01-06 16:20:00 20.90 50000
2011-01-06 17:24:00 20.90 50000
2011-01-06 17:55:00 20.90 50000
2011-01-06 18:49:00 20.90 10000
[EDIT: Following GSee's comment, here is a larger (& fixed) sample dput of the data:]
dput(head(sample.data,200))
structure(c(23.48, 19.08, 22.43, 21.75, 19.78, 19.76, 23.11, 23.31, 19.58, 19.28, 23.68, 23.47, 22.13, 23.17, 24.53, 22.79, 20.28, 19.73, 22.05, 19.23, 22.40, 23.43, 20.40, 23.58, 19.94, 19.58, 21.70, 20.65, 21.69, 24.09, 22.21, 21.42, 19.48, 24.20, 23.37, 24.68, 21.64, 23.24, 23.59, 19.61, 21.89, 23.21, 19.47, 24.77, 19.64, 19.61, 21.07, 24.68, 20.20, 23.37, 24.47, 22.55, 19.64, 22.73, 23.94, 21.31, 19.69, 19.51, 21.27, 20.47, 23.67, 23.42, 20.56, 20.60, 22.23, 22.16, 20.78, 21.47, 24.82, 19.34, 19.08, 24.92, 24.12, 20.46, 21.67, 22.25, 19.16, 22.47, 24.00, 23.32, 21.13, 23.67, 20.35, 20.80, 23.48, 21.22, 21.51, 21.91, 24.06, 22.17, 24.23, 22.43, 24.94, 20.99, 24.93, 22.48, 21.74, 19.18, 21.14, 21.89, 20.23, 24.80, 22.81, 20.96, 24.30, 21.11, 23.69, 23.20, 20.09, 23.09, 21.56, 20.93, 22.97, 24.53, 23.96, 19.33, 24.63, 19.27, 19.57, 21.40, 24.04, 22.37, 24.95, 21.37, 24.01, 20.42, 21.82, 19.06, 23.08, 20.37, 21.28, 19.87, 21.84, 20.95, 20.89, 24.59, 19.18, 24.26, 20.64, 22.12, 20.85, 24.31, 22.55, 24.32, 19.90, 24.58, 24.13, 21.55, 21.06, 20.84, 20.68, 23.47, 19.67, 21.82, 23.31, 19.67, 19.45, 24.56, 20.06, 24.46, 19.89, 21.57, 20.21, 24.20, 20.53, 19.91, 23.67, 22.95, 21.18, 21.57, 20.01, 19.36, 20.65, 19.75, 22.12, 19.79, 21.07, 24.77, 21.46, 20.64, 21.94, 19.81, 23.58, 22.48, 21.03, 19.37, 19.46, 22.68, 24.56, 21.40, 21.94, 23.27, 21.26, 20.34, 23.39, 21.97, 22.25, 22.40, 23.44, 24.38, 5900, 2400, 3100, 1200, 5100, 5700, 3400, 4900, 3900, 6400, 2600, 2200, 2800, 6200, 1100, 2500, 3500, 5900, 6600, 5100, 1800, 4200, 4500, 1200, 6200, 1400, 2800, 2400, 1500, 900, 2900, 3200, 5800, 3500, 4500, 4500, 4500, 6800, 5000, 4900, 5300, 6100, 3400, 1500, 6200, 1500, 5400, 5800, 800, 6300, 5000, 3400, 5500, 5600, 2100, 3700, 4400, 3100, 4600, 2700, 2700, 2700, 2600, 1200, 1700, 2900, 5300, 4900, 2600, 3100, 4000, 6000, 2500, 3000, 6500, 3600, 5900, 2300, 1600, 1300, 4800, 3800, 4000, 1500, 5600, 5500, 1500, 3100, 1300, 3200, 1500, 1400, 1400, 4600, 5300, 5600, 1200, 1500, 6100, 2200, 6700, 1700, 4000, 6200, 2700, 1100, 4500, 5400, 1600, 2600, 5700, 4100, 7000, 2300, 2600, 5100, 4500, 1900, 2400, 2600, 6900, 2100, 3500, 3600, 2900, 900, 2800, 2200, 4900, 1700, 3000, 800, 2500, 6100, 3300, 6500, 2600, 6200, 900, 4000, 6600, 6400, 1000, 4800, 6900, 4000, 2800, 1200, 6200, 2700, 1300, 1200, 2800, 1300, 5900, 4000, 3900, 5300, 2400, 4700, 6900, 2900, 6200, 3300, 6900, 2200, 1800, 5500, 2000, 6400, 3500, 5300, 5000, 6700, 4500, 3200, 1700, 3300, 5000, 5800, 4900, 2100, 1200, 5200, 1000, 900, 2600, 1000, 5500, 3400, 1600, 4700, 1500, 5900, 6900, 3700, 4900, 5700, 900, 2400), class = c("xts", "zoo"), .indexCLASS = c("POSIXlt",
"POSIXt"), .indexTZ = "", tclass = c("POSIXlt", "POSIXt"), tzone = "", index = structure(c(1325584080, 1325594940, 1325594940, 1325604600, 1325759100, 1325762520, 1325762520, 1325769300, 1325769300, 1325848080, 1325864880, 1326128220, 1326196500, 1326196500, 1326196500, 1326196500, 1326196500, 1326196500, 1326209700, 1326279480, 1326283620, 1326288300, 1326288300, 1326289680, 1326289680, 1326289680, 1326292320, 1326294060, 1326294600, 1326297600, 1326387000, 1326456720, 1326467160, 1326711600, 1326723000, 1326724260, 1326809940, 1326814860, 1326885960, 1326885960, 1326889980, 1326894000, 1326895200, 1326895200, 1326898080, 1326986700, 1326987240, 1326992100, 1327072140, 1327328040, 1327328040, 1327328040, 1327417920, 1327423140, 1327424820, 1327425240, 1327483200, 1327496520, 1327570320, 1327570320, 1327575420, 1327588680, 1327588980, 1327595880, 1327595880, 1327595880, 1327664820, 1327674720, 1327680660, 1327680780, 1327680780, 1327683960, 1327914300, 1327914300, 1327915260, 1327918140, 1327924860, 1327924920, 1327924980, 1327924980, 1327927680, 1328013360, 1328014200, 1328025000, 1328025000, 1328026740, 1328089440, 1328091360, 1328091360, 1328110620, 1328111340, 1328111340, 1328112420, 1328113800, 1328193540, 1328194080, 1328194140, 1328196720, 1328274360, 1328274420, 1328278320, 1328519280, 1328520120, 1328520600, 1328520600, 1328524140, 1328527980, 1328531580, 1328540880, 1328540880, 1328547600, 1328547660, 1328547720, 1328547780, 1328607060, 1328608080, 1328618760, 1328623380, 1328623380, 1328625720, 1328631480, 1328717760, 1328717880, 1328793000, 1328797980, 1329132840, 1329210480, 1329215400, 1329215820, 1329215820, 1329219480, 1329223140, 1329300900, 1329301620, 1329315240, 1329315240, 1329388740, 1329389700, 1329390000, 1329390000, 1329390180, 1329391860, 1329391860, 1329391860, 1329402120, 1329467700, 1329467700, 1329469080, 1329469080, 1329471300, 1329477000, 1329477000, 1329489840, 1329732660, 1329732660, 1329733080, 1329745980, 1329745980, 1329750120, 1329750120, 1329750780, 1329753780, 1329753780, 1329753900, 1329754080, 1329754260, 1329755520, 1329755520, 1329755820, 1329755880, 1329755880, 1329756000, 1329815040, 1329815040, 1329826380, 1329840000, 1329841200, 1329841200, 1329842220, 1329901080, 1329901200, 1329901200, 1329901680, 1329901680, 1329902040, 1329902100, 1329902100, 1329902100, 1329902340, 1329902340, 1329902340, 1329902520, 1329902880, 1329902880, 1329906720, 1329906840, 1329908460, 1329910980, 1329911880, 1329914400), tzone = "", tclass = c("POSIXlt", "POSIXt")), .Dim = c(200L, 2L), .Dimnames = list(NULL, c("sample.dataprice", "sample.datavol")))
There are often multiple trades in the same time period, and in such cases I'd like the runMAD
to view each one separately if it doesn't already. Also, I realise that the window of runMAD
might be larger than the number of trades in a particular day, and in such cases I'd like to ignore (NA) that day - or if possible reduce the runMAD
window size dynamically to the number of trades in that day.
The str of my data:
str(sample.data)
An ‘xts’ object on 2012-01-03 09:11:00/2012-03-30 18:49:00 containing:
Data: num [1:459, 1:2] 22.6 22.5 22.5 22.5 22.8 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:2] "sample.dataprice" "sample.datavol"
Indexed by objects of class: [POSIXlt,POSIXt] TZ:
xts Attributes:
NULL
I split
this into days and then runMAD
:
sample.data.days <- split.xts(sample.data, f="days")
require(TTR)
runMAD(sample.data.days, n=6, stat="median", cumulative=FALSE, constant=1)
*Error in runMedian(x, n, cumulative = cumulative) : (list) object cannot be coerced to type 'double'*
To try to avoid applying runMAD
to days where there are fewer observations than the window length, I keep only the days with sufficient observations:
sample.data.days6<- sample.data.days[vapply(sample.data.days, FUN=nrow, FUN.VALUE=0L)>6]
I've also tried this:
do.call(runMAD, lapply(as.numeric(index(sample.data.days6)), function(x) runMAD(sample.data.days6, constant=1)))
But it throws the same error as above (Error in runMedian(x, n, cumulative = cumulative) : (list) object cannot be coerced to type 'double').
Instead of putting the data into a list to break into days, I've also tried this:
apply.daily(sample.data, function(d){
c(runMAD_price = runMAD(d$sample.data[, 1], n=6),
runMAD_volume = runMAD(d$sample.data[, 2], n=6),
)
})
*Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x), : 'data' must be of a vector type
A slight variation of this question is to see how the same could be done for a moving time window (e.g. 1 hour), instead of a moving window of trades.
Whether the window is time or observation-based, I'd like to eventually create an xts object that contains all observations for which the deviation of the value (price or volume) from a certain window's median, divided by the MAD (at mid-window), is above a certain threshold (e.g. 3). Since the MAD will be 0 when half of the observations are the same, I'd like to replace any 0 with any previous MAD>0.
do.call(rbind, lapply(split(x, "days"), runMAD))
, but it would be easier to show that if you gave data. – GSeerunMAD
– GSeedput
:Error: unexpected ')' in ""sample.datavol")"
. Anyway, are you still unclear on how to do what you want? – GSee