3
votes

I have started using ELKI for data analysis, but one seemingly simple thing I cannot seem to do is output the calculated convex hull of clusters to a file after running DBSCAN. I am able to visualize the convex hulls via the visualization gui, but I cannot generate the KML file. I am also able to write my clustering results to a folder (using the ResultWriter resulthandler), but no file is generated when I set the KMLOutputHandler. I receive no error message in the log window (even with verbose parameter set to true).

Is there a trick to generating a KML file in ELKI? Could anyone walk through the steps of doing this?

Any help would be appreciated.

(as an aside, is it possible to generate alpha shapes for DBSCAN results with ELKI? If so, which parameter must be adjusted?)

1

1 Answers

3
votes

So that is actually a lot of questions in one...

  1. Cunvex hulls: they are used in ELKI for visualization, but not considered part of the output result, so they are not saved to file. A trick you could employ is to save the visualization as SVG and extract them from this file, but they will then be in a different coordinate system.

    One of the reasons for this is that the convex hulls are only implemented for 2D Euclidean space - I figure you want to use it for spatial data, where it may actually happen to not return the correct convex hull then due to the curvature of the earth surface. Furthermore, many data sets will be of higher dimensionality.

    However, you can of course look at the source code and invoke the convex hull algorithm, then write the result to your favorite output format. In general, just as you will need to spend time on preprocessing, you will also need to customize the output.

  2. Which brings me to the second question. The KMLResultHandler is closely tied to the publication of ELKI 0.4.0: Spatial Outlier Detection: Data, Algorithms, Visualizations. Which pretty much summarizes what this class does: visualize spatial outlier detection. It currently does not (yet) include code to visualize clusters of spatial data, for example. In order to get an output from the class, you need to ensure a number of restrictions, unfortunately. Essentially, if it finds a Polygon relation and a OutlierResult that it can map to each other, it will output this to KML. It is not yet a class that could write arbitrary results to KML. It probably needs a lot more of documentation, too. Contributions of a more general output tool would be appreciated; but a customizeable, automatic, general output to KML is really hard to do. In particular, you may also end up having to include projection capabilities then, if someone is not processing Latitude-Longitude data, but e.g. UTM projected data. As such, I recommend looking at the source code of the class and customizing it to your needs. In my opinion, visualization to KML will always require a lot of customization.

  3. To generate alpha shapes (only the hull, not the extended alpha shape - the optimal visualization of DBSCAN would likely consist of the alpha shape of the core points only, extended by a radius of epsilon, which should then include the border points. This is on the wish list, but not implemented), you just need to set the -hull.alpha parameter to the desired alpha value. Note that this happens in the visualization projection, not at the raw data. If the axes are scaled differently, alpha shapes will look differently. Again, you may be interested in using the class AlphaShape on the raw data vectors, instead of exporting the projected visualization. Then you can easily write the resulting Polygons to your custom visualization.

If you implement such a KML visualization using alpha shapes (or convex hulls) for clusters, I would appreciate if you could contribute this to ELKI to make it available for others as well. Thank you.