I have been searching for an answer to this for a while without much luck so fingers crossed someone can help me!

I am dealing with cyclical data and I am trying to find the values of the two peaks and two troughs in each cycle. These don't necessarily equate to the max/min and second max/min values; rather, I want the local maxima/minima, i.e. values that are larger/smaller than both the preceding and subsequent values.

This is an example of one cycle

x <- c(3.049, 3.492, 3.503, 3.429, 3.013, 2.881, 2.29, 1.785, 1.211, 0.890, 0.859, 0.903, 1.165, 1.634, 2.073, 2.477, 3.162, 3.207, 3.177, 2.742, 2.24, 1.827, 1.358, 1.111, 1.063, 1.098, 1.287, 1.596, 2.169, 2.292)

I have thousands of cycles, so I am using group_by in dplyr to group the cycles and then hoped to apply the conditional max/min logic within groups.

I would appreciate any advice with this,

Thanks in advance

Edit

I have since used the function from the answer below, with just a slight edit to the last line:

  return(data.frame(Data.value=x, Time=y, Date=z,HHT=peak, LLT=trough)) 

where x is my original x above, y is a time var and z is a date var. This allowed me to do some extra calculations on the results (I needed the time at which the value was min/max as well as the value itself).

So now I have a data frame with everything I need, but it is only for one date. I still can't get this to run over the whole dataset using the group_by function. I have tried subsetting by date using

subsets<-split(data, data$datevar, drop=TRUE)

But I still need a way to run the findminmax function (and my few extra lines of code) for each subset. Any ideas?

1 Answer

Consider the following custom function, which you can call inside a dplyr group_by() pipeline. Essentially, the function iterates through the vector of cyclical values and compares each value with its neighbors before and after it. A peak has both neighbors lower than itself, and a trough has both neighbors larger than itself. The first and last values have only one neighbor, so they are skipped.

findminmax <- function(x){
  peak <- list(NA, NA)                              # INITIALIZE TEMP LISTS AND ITERATORS
  p <- 1
  trough <- list(NA, NA)
  tr <- 1                                           # NAMED tr SO IT IS NOT CONFUSED WITH base::t()

  if (length(x) >= 3) {                             # NEED AT LEAST ONE INTERIOR VALUE
    for (i in 2:(length(x) - 1)){                   # LEAVES OUT FIRST AND LAST VALUES
      if (x[i] > x[i-1] && x[i] > x[i+1]) {         # STRICTLY GREATER THAN BOTH NEIGHBORS
        peak[p] <- x[i]
        p <- p + 1
      }
      if (x[i] < x[i-1] && x[i] < x[i+1]){          # STRICTLY LESS THAN BOTH NEIGHBORS
        trough[tr] <- x[i]
        tr <- tr + 1
      }
    }
  }
  return(list(peak1=peak[[1]], peak2=peak[[2]], 
              trough1=trough[[1]], trough2=trough[[2]]))
}

result <- findminmax(x)
#$peak1
#[1] 3.503    
#$peak2
#[1] 3.207    
#$trough1
#[1] 0.859    
#$trough2
#[1] 1.063
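
If you also need the position of each peak and trough (for example, to look up the matching time), the same neighbor comparison can be sketched without a loop. This is only a rough alternative, not a replacement for the function above: find_turning_points is a hypothetical helper name, and it relies on the second difference of sign(diff(x)) being -2 at a strict peak and +2 at a strict trough.

```r
# Hypothetical helper: positions of strict local maxima/minima in x
find_turning_points <- function(x) {
  s <- diff(sign(diff(x)))             # -2 at a strict peak, +2 at a strict trough
  list(peaks   = which(s == -2) + 1,   # +1 maps back to positions in x
       troughs = which(s ==  2) + 1)
}

# The example cycle from the question
x <- c(3.049, 3.492, 3.503, 3.429, 3.013, 2.881, 2.29, 1.785, 1.211, 0.890,
       0.859, 0.903, 1.165, 1.634, 2.073, 2.477, 3.162, 3.207, 3.177, 2.742,
       2.24, 1.827, 1.358, 1.111, 1.063, 1.098, 1.287, 1.596, 2.169, 2.292)

tp <- find_turning_points(x)
x[tp$peaks]     # values at the peak positions
x[tp$troughs]   # values at the trough positions
```

With a time vector y of the same length as x, y[tp$peaks] and y[tp$troughs] would give the times at which those values occur.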

For dplyr's group_by, assuming x (values), y (time) and z (date) are columns of originaldf:

finaldf <- originaldf %>% 
             group_by(z) %>% 
                summarise(Time = mean(y),
                          HHT1 = findminmax(x)$peak1,
                          HHT2 = findminmax(x)$peak2,
                          LLT1 = findminmax(x)$trough1,
                          LLT2 = findminmax(x)$trough2)
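
If you would rather stay with the split() approach from your edit, one option is to wrap the per-date work in a function and apply it to every subset with lapply(), then bind the pieces back together. The sketch below is only an illustration under assumptions: the toy data, the column names x, y, z and the helper first_two are all made up, and first_two is a compact stand-in for findminmax plus your extra lines, not your actual code.

```r
# Hypothetical toy data: two dates (z), a time index (y) and the values (x)
cycle <- c(1, 3, 2, 0, 1, 4, 2, 1, 2)
data <- data.frame(z = rep(c("2020-01-01", "2020-01-02"), each = length(cycle)),
                   y = rep(seq_along(cycle), times = 2),
                   x = c(cycle, cycle + 10))

# Hypothetical per-subset step: first two strict peaks/troughs and their times
first_two <- function(d) {
  s  <- diff(sign(diff(d$x)))
  pk <- (which(s == -2) + 1)[1:2]    # NA where fewer than two peaks exist
  tr <- (which(s ==  2) + 1)[1:2]    # NA where fewer than two troughs exist
  data.frame(Date = d$z[1],
             HHT1 = d$x[pk[1]], HHT1_time = d$y[pk[1]],
             HHT2 = d$x[pk[2]], HHT2_time = d$y[pk[2]],
             LLT1 = d$x[tr[1]], LLT1_time = d$y[tr[1]],
             LLT2 = d$x[tr[2]], LLT2_time = d$y[tr[2]])
}

# One data frame per date, one result row per subset, then stack the rows
subsets <- split(data, data$z, drop = TRUE)
results <- do.call(rbind, lapply(subsets, first_two))
results
```

The result has one row per date, so any further calculations can be done on results directly rather than inside the loop.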