
I have time-series data in which each value of a variable increases or decreases from the previous value within some range, say ±10%. Some data points in the series do not fit with the values before or after them.

For example:

time       v1
13:01:30   0.689
13:01:31   0.697
13:01:32   0.701
13:01:33   0.713
**13:01:34   0.235**
13:01:35   0.799
13:01:36   0.813
13:01:37   0.822 
**13:01:38   0**
13:01:39   0.865
13:01:40   0.869

Is there a library in R that might help identify these outlier values (0.235 and 0 in the data above)?

update - output of dput:

structure(list(time = c("13:01:30", "13:01:31", "13:01:32", "13:01:33", 
"13:01:34", "13:01:35", "13:01:36", "13:01:37", "13:01:38", "13:01:39", 
"13:01:40"), v1 = c(0.689, 0.697, 0.701, 0.713, 0.235, 0.799, 
0.813, 0.822, 0, 0.865, 0.869)), .Names = c("time", "v1"), row.names = c(NA, 
11L), class = c("tbl_df", "tbl", "data.frame"))
@akrun - admittedly, outliers in a whole data set are different from localised outliers. For this simplified example they give the same results, but examining residuals from an lm fit, or diff comparisons, might be worthwhile. - thelatemail
@thelatemail I reopened the post. - akrun
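
One way to sketch the localised-outlier idea from the comments is to compare each value with a running median of its neighbours rather than with the overall median. This is only a minimal sketch, not the commenters' exact suggestion (they mention lm residuals and diff comparisons). The window width `k = 5` and the 10% tolerance (borrowed from the ±10% range in the question) are assumptions, and the names `local_med`, `rel_dev` and `is_local_outlier` are made up for illustration.

```r
# OP's data, copied from the dput output in the question
ts_df <- data.frame(
    time = c("13:01:30", "13:01:31", "13:01:32", "13:01:33", "13:01:34",
             "13:01:35", "13:01:36", "13:01:37", "13:01:38", "13:01:39",
             "13:01:40"),
    v1 = c(0.689, 0.697, 0.701, 0.713, 0.235, 0.799,
           0.813, 0.822, 0, 0.865, 0.869),
    stringsAsFactors = FALSE
)

# running median over a 5-point window (base R, stats::runmed);
# k = 5 is an assumed width and must be odd
local_med <- runmed(ts_df$v1, k = 5)

# relative deviation of each point from its local median
# (assumes the local median is never zero)
rel_dev <- abs(ts_df$v1 - local_med) / local_med

# flag points that deviate by more than the assumed 10% tolerance
is_local_outlier <- rel_dev > 0.10

# show the flagged rows
ts_df[is_local_outlier, ]
```

On the sample data this flags the 13:01:34 and 13:01:38 rows. A 5-point window tolerates at most two bad values per window, so runs of three or more consecutive bad values would need a wider window.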

1 Answer


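This may help as a template. It ranks the points by their absolute deviation from the overall median and plots the largest deviations, labelled by time: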

# load packages
library(ggplot2)   # 2.0.0
library(ggrepel)   # 0.4
library(dplyr)     # 0.4.3

# make data_frame of OP data
ts_tdf <- data_frame(
    time = paste("13", "01", 30:40, sep = ":"),
    v1 = c(0.689, 0.697, 0.701, 0.713, 0.235, 0.799, 0.813, 0.822, 0.00, 0.865, 0.869)   
)

# calculate measure of central tendency (I like median)
v1_median <- median(ts_tdf$v1)

# compute each point's absolute deviation from the median, keep the 10 largest
# deviations, and plot them in descending order, labelled by time
ts_tdf %>%
    mutate(abs_med = abs(v1 - v1_median)) %>%
    arrange(desc(abs_med)) %>%
    head(n = 10) %>%
    mutate(rank = row_number()) %>%
    ggplot(aes(x = rank, y = abs_med, label = time)) +
    geom_point() +
    geom_text_repel()
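
If an explicit flag is wanted rather than a plot, one option is to scale the absolute deviation by the median absolute deviation (MAD) and apply a cutoff. The following is a rough sketch in base R, not part of the template above: the cutoff of 3 is an assumed rule of thumb, and `robust_z` and `is_outlier` are names made up for illustration.

```r
# OP's data (same values as in the question)
time <- paste("13", "01", 30:40, sep = ":")
v1 <- c(0.689, 0.697, 0.701, 0.713, 0.235, 0.799,
        0.813, 0.822, 0.00, 0.865, 0.869)

# median and scaled MAD (stats::mad uses constant = 1.4826 by default)
v1_median <- median(v1)
v1_mad <- mad(v1)

# robust z-score: absolute deviation from the median in units of MAD
robust_z <- abs(v1 - v1_median) / v1_mad

# assumed cutoff: treat anything beyond 3 MADs as an outlier
is_outlier <- robust_z > 3

# summary table with the flag alongside the original values
data.frame(time, v1, robust_z = round(robust_z, 2), is_outlier)
```

On this data the rule flags 0.235 and 0. Because it measures deviation from the overall median, it ignores any trend in the series, so the cutoff may need tuning on longer or steeper series.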