1
votes

I'm having trouble getting my data into ELKI to run the OPTICS algorithm, but with the R implementation of OPTICS I can easily get the list of reachability distances. I can write them to a file in (index, distance) format, like this:

1 Inf

2 0.5

3 0.9 ...
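For reference, a file in this layout can be read back with a few lines of Python. This is only a minimal sketch: it assumes two whitespace-separated columns per row, that the rows are already in OPTICS cluster order, and that undefined reachability is written as `Inf`. The function name `parse_reachability` is made up for illustration.

```python
import math


def parse_reachability(lines):
    """Parse 'index distance' rows into a list of floats, in file order."""
    reach = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        # float() accepts "Inf" and turns it into math.inf
        reach.append(float(parts[1]))
    return reach


# Sample rows copied from the listing above (blank lines included on purpose).
sample = ["1 Inf", "", "2 0.5", "", "3 0.9"]
reachability = parse_reachability(sample)
print(reachability)  # [inf, 0.5, 0.9]
```

For a real file, the same function can be given an open file handle instead of the `sample` list.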

I want to find clusters that are separated by local minima. I think the OPTICS Xi algorithm in ELKI does this, but since I can't import my data, I can't use it.

Would it be easier to just write the OPTICS cluster extraction algorithm in another language, using the pseudo-code from the OPTICS paper? I think that, at its most basic, it just groups indices together when they are adjacent and the reachability does not drop to a new local minimum.
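To make the "group adjacent indices" idea concrete, here is a rough sketch of the simplest possible extraction: a horizontal cut through the reachability plot at a fixed threshold. This is not the Xi method and not what ELKI does. It is a simplified variant of the DBSCAN-style extraction, adapted to use reachability values only, because core distances are not in the file. The name `threshold_clusters` and the `eps` value are assumptions chosen for illustration.

```python
import math


def threshold_clusters(reach, eps):
    """Label points in cluster order by cutting the reachability plot at eps.

    Returns one label per point: 0, 1, 2, ... for clusters, -1 for noise.
    """
    labels = [-1] * len(reach)
    current = -1
    for i, r in enumerate(reach):
        if r > eps:
            # A high bar: this point is far from everything before it.
            # Without core distances, treat it as the start of a new cluster
            # only if the next point is reachable within eps; else noise.
            nxt = reach[i + 1] if i + 1 < len(reach) else math.inf
            if nxt <= eps:
                current += 1
                labels[i] = current
        else:
            if current < 0:
                current = 0  # defensive: input did not start with a high bar
            labels[i] = current
    return labels


reach = [math.inf, 0.5, 0.9, 3.0, 0.4, 0.6, 5.0, 4.0]
print(threshold_clusters(reach, eps=1.0))  # [0, 0, 0, 1, 1, 1, -1, -1]
```

A single threshold cannot recover nested clusters or clusters of differing density, which is what the steepness-based Xi extraction is designed for.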

Thanks

1
I bet there is a Java interface that you just need to implement. In ELKI, everything seems to have an interface to plug into. But it may be easier to fix your "some trouble", as ELKI works really well (make sure to enable an index and to set an upper bound on epsilon; this speeds things up a lot). This appears to be the only usable OPTICS implementation. - Has QUIT--Anony-Mousse

1 Answer

2
votes

You could try implementing the OPTICSTypeAlgorithm interface, which largely means reading your data and storing it in an object of type ClusterOrder.

However, ELKI includes a slightly more advanced version of OPTICS that produces better results with OPTICSXi. The details will eventually be published, probably as a tech report. The data you got from the R implementation is not enough to correct for some common artifacts in the OPTICS plot.

Please use the OPTICS version in ELKI. Try the Cover tree index (which is quite fast and easy to use). Avoid using ID columns in your input data, or tell the parser which column is the ID column.
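If the import trouble comes from an ID column, one option is to strip that column before handing the file to ELKI. The sketch below assumes comma-separated input with the ID in the first column. The helper name `drop_column` and the sample rows are made up for illustration, and nothing here uses an ELKI API.

```python
import csv
import io


def drop_column(src, dst, column=0, delimiter=","):
    """Copy delimited rows from src to dst, leaving out one column."""
    reader = csv.reader(src, delimiter=delimiter)
    writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")
    for row in reader:
        if not row:
            continue  # skip empty lines
        writer.writerow(row[:column] + row[column + 1:])


# In-memory stand-ins for the real input and output files.
raw = io.StringIO("1,0.10,0.20\n2,0.15,0.22\n3,5.00,5.10\n")
cleaned = io.StringIO()
drop_column(raw, cleaned)
print(cleaned.getvalue())
```

With real files, pass handles opened with `open(path, newline="")` in place of the `StringIO` objects.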