
I would like advice on the DBSCAN clustering algorithm. I am applying it to a matrix of latitude and longitude values from a seismic catalogue. Which evaluation criteria are appropriate for judging whether DBSCAN has produced the correct number of clusters? I am working in Matlab and currently use the gap ('elbow') evaluation criterion with k-means, but I have read that it may not be appropriate, since k-means does not work well for density-based clusters.

Also, the Matlab implementation of DBSCAN has two outputs, type and class. Could someone tell me what the class output is? I think it assigns each data point to its cluster, but I am not sure.

Any help would be appreciated. Thank you, Dennis

Please add a minimal reproducible example so we can take a look and try to help you. - Adriaan
This looks like a question for academic work. For such general questions it is better to ask your supervisor or coworkers who have used this algorithm before for help and information. - Adriaan
To be honest, I do not know if I can do that. My code seems to be working, but for the time being it is a mess and rather long. However, as soon as I finish the background reading I will ask more focused questions. Thank you. - Dennis M

1 Answer


Most cluster validation methods cannot handle noise points, and DBSCAN produces them: it leaves some points unassigned to any cluster.

You should try

Moulavi, D., Jaskowiak, P. A., Campello, R. J. G. B., Zimek, A., & Sander, J. (2014). Density-based clustering validation. In Proceedings of the 14th SIAM International Conference on Data Mining (SDM), Philadelphia, PA.

which is the only approach I am aware of that is designed for density-based clusters. I have not tried it yet, though; I prefer manual evaluation.

Instead of DBSCAN, also try OPTICS and HDBSCAN*.
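
To make the noise issue concrete, here is a minimal sketch of DBSCAN in plain Python (not Matlab, and not the implementation the question uses). The names `dbscan`, `region_query`, `eps` and `min_pts` and the sample points are my own assumptions for illustration. It uses plain Euclidean distance on the coordinate pairs, which is only a rough stand-in for real geographic distance. The returned list holds one cluster label per point, with -1 for noise. That is presumably what the class output in the question holds as well, but I have not checked that implementation.

```python
from math import hypot

NOISE = -1        # label for points that belong to no cluster
UNVISITED = None  # placeholder before a point has been processed


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (including i)."""
    xi, yi = points[i]
    return [j for j, (xj, yj) in enumerate(points)
            if hypot(xi - xj, yi - yj) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [UNVISITED] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            # Not a core point; may still become a border point later.
            labels[i] = NOISE
            continue
        labels[i] = cluster_id
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # noise reached from a core point: border
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster_id
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is a core point, keep expanding
        cluster_id += 1
    return labels


# Hypothetical sample: two tight groups and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
          (10.0, 0.0)]
labels = dbscan(points, eps=0.2, min_pts=3)
n_clusters = len(set(labels) - {NOISE})
n_noise = labels.count(NOISE)
print(labels, n_clusters, n_noise)
```

Note that the number of clusters is not an input here: it falls out of `eps` and `min_pts`. The isolated point ends up with the label -1, and an index that expects every point to belong to some cluster has no sensible way to score it.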