I am processing a high-frequency stream of timestamped events with no ordering guarantee (in practice about 90% of them arrive in order). I need to keep these events in my program for some time as a cache. My computations mainly iterate over the event collection, so it would be much easier, and faster, if the cache were kept sorted by timestamp. What I am looking for is an ordered data structure that is fast for both insertion and iteration and that allows duplicates.
Among the suggestions I have found online, I have tried:
- TreeSet -> does not work because I may have duplicate timestamps
- PriorityQueue -> does not work because its iterator does not guarantee priority order
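Both observations can be reproduced with a small sketch. The `Event` class and the `BY_TS` comparator below are hypothetical stand-ins for the real event type. A `TreeSet` ordered only by timestamp treats two events with the same timestamp as equal and silently drops the second one, and a `PriorityQueue` yields elements in priority order only through `poll()`, which empties the queue.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.TreeSet;

final class OrderedCollectionPitfalls {
    /** Hypothetical event type, used only for illustration. */
    static final class Event {
        final long ts;
        final String payload;

        Event(long ts, String payload) {
            this.ts = ts;
            this.payload = payload;
        }
    }

    /** Orders events by timestamp only (assumed to match the question's comparator). */
    static final Comparator<Event> BY_TS = Comparator.comparingLong(e -> e.ts);

    /** Number of events a timestamp-ordered TreeSet retains. */
    static int treeSetSize(List<Event> events) {
        TreeSet<Event> set = new TreeSet<>(BY_TS);
        set.addAll(events); // events whose timestamp is already present are rejected
        return set.size();
    }

    /** Timestamps in priority order, obtained by draining the queue with poll(). */
    static List<Long> drainInOrder(List<Event> events) {
        PriorityQueue<Event> queue = new PriorityQueue<>(BY_TS);
        queue.addAll(events);
        List<Long> out = new ArrayList<>();
        while (!queue.isEmpty()) {
            out.add(queue.poll().ts); // poll() is ordered, iterator() is not guaranteed to be
        }
        return out;
    }
}
```

Draining with `poll()` costs O(n log n) for every full pass and destroys the queue, which is why it is a poor fit when the cache has to be iterated repeatedly.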
Since 9 out of 10 events arrive in order, I thought I could use a plain ArrayList with a modified version of the add method:
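    import java.util.ArrayList;
    import java.util.Collection;
    import java.util.Collections;
    import java.util.Comparator;

    public class TimeOrderedArrayList<E> extends ArrayList<E> {
        private long lastTs;
        private final Comparator<E> comparator;
        private final TimeGetter<E> tsgetter;

        public TimeOrderedArrayList(Comparator<E> comparator, TimeGetter<E> tsgetter) {
            super();
            this.comparator = comparator;
            this.tsgetter = tsgetter;
            this.lastTs = Long.MIN_VALUE;
        }

        @Override
        public boolean add(E e) {
            long ts = tsgetter.getTime(e);
            if (ts >= lastTs) {
                // Common case: the event is in order, so append at the end.
                lastTs = ts;
                return super.add(e);
            } else {
                // VERSION 1: scan backwards from the tail and insert after the
                // last element whose timestamp is <= ts.
                int index = super.size();
                while (index > 0 && tsgetter.getTime(super.get(index - 1)) > ts) {
                    index--;
                }
                super.add(index, e);
                // VERSION 2 (used instead of VERSION 1): binary search for the insertion point.
                // int index = Collections.binarySearch(this, e, comparator);
                // super.add(index > -1 ? index : -index - 1, e);
                return true;
            }
        }

        @Override
        public boolean addAll(Collection<? extends E> c) {
            boolean result = super.addAll(c);
            super.sort(comparator);
            // Keep lastTs in sync with the new tail so later add() calls stay correct.
            if (!isEmpty()) {
                lastTs = tsgetter.getTime(get(size() - 1));
            }
            return result;
        }
    }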
But with both versions I get really bad performance.
Any suggestions?
A LinkedList-based implementation will perform better than ArrayList, as random inserts are far less expensive. – Brett Okken
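A minimal sketch of what the LinkedList suggestion could look like, assuming late events usually belong close to the tail. The class name `TailInsertSketch` and the `insertFromTail` helper are hypothetical, and `ToLongFunction` stands in for the question's `TimeGetter`. The insert itself is O(1) once the position is found, but finding it is still a linear walk, so whether this actually beats the ArrayList version is something to measure rather than assume.

```java
import java.util.LinkedList;
import java.util.ListIterator;
import java.util.function.ToLongFunction;

final class TailInsertSketch {
    /**
     * Inserts e into a list that is already sorted by timestamp, walking
     * backwards from the tail. Elements with equal timestamps keep their
     * arrival order.
     */
    static <E> void insertFromTail(LinkedList<E> list, E e, ToLongFunction<E> ts) {
        long t = ts.applyAsLong(e);
        // Start with the cursor positioned after the last element.
        ListIterator<E> it = list.listIterator(list.size());
        while (it.hasPrevious()) {
            if (ts.applyAsLong(it.previous()) <= t) {
                it.next(); // move the cursor back to just after that element
                break;
            }
        }
        it.add(e); // inserts at the current cursor position
    }
}
```

For an in-order event the loop stops after a single comparison, so the common case stays cheap; only out-of-order events pay for the backward walk.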