I am trying to reconstruct a brain tumor image after clustering using hdbscan.

However, unlike k-means, HDBSCAN does not provide cluster centers, so I am a bit confused about how to obtain the clustered image. I have tried computing a reference center for each cluster: I match the (65536, 3) pixel array against the HDBSCAN labels (r), take the mean of the points in each cluster, and store the results in crs.

I am unsure whether this is the best way to reconstruct the image, that is, to compute a mean center for each cluster and then rebuild the image from those centers and the labels.

import numpy as np

# mriarr: (65536, 3) pixel array; r: HDBSCAN labels, one per pixel (-1 marks noise)
crs = np.zeros((dbnumber_of_clusters, 3))
for i in range(dbnumber_of_clusters):
    # mean of all pixels assigned to cluster i
    crs[i, :] = mriarr[r == i].mean(axis=0)

1 Answer

HDBSCAN is not designed to "reconstruct" data, so there may not be an elegant way to do this.

Using the mean of each cluster is the obvious choice if you want to mimic what k-means does, but that point may lie outside the actual cluster when the cluster is not convex. So it may be more appropriate to choose the densest point instead. Furthermore, the clustering is supposed to be hierarchical, so when computing a cluster representative you should also take the data of nested clusters into account. Last but not least, HDBSCAN can produce a "noise cluster". That is not actually a cluster, just all of the unclustered data, so computing a single representative for those points is not meaningful. Instead, you probably want to treat each of these points as its own cluster.
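A minimal sketch of the mean-based variant, with noise points handled as described above, might look like the following. It is only an illustration, not a definitive implementation: the function name reconstruct_from_labels and the tiny pixels/labels arrays are made up for the example, and it assumes noise points carry the label -1, as in the hdbscan Python library. It does not attempt the densest-point or nested-cluster refinements.

```python
import numpy as np

def reconstruct_from_labels(pixels, labels):
    """Replace each clustered pixel with its cluster mean.

    Noise pixels (assumed to be labelled -1) are treated as their own
    clusters, i.e. they simply keep their original value.
    """
    pixels = np.asarray(pixels, dtype=float)
    labels = np.asarray(labels)
    out = pixels.copy()  # noise pixels stay untouched
    for lab in np.unique(labels):
        if lab < 0:
            continue  # skip the "noise cluster"
        mask = labels == lab
        out[mask] = pixels[mask].mean(axis=0)
    return out

# Hypothetical stand-in for the (65536, 3) pixel array and its labels.
pixels = np.array([
    [0.0, 0.0, 0.0],
    [2.0, 2.0, 2.0],
    [10.0, 10.0, 10.0],
    [12.0, 12.0, 12.0],
    [100.0, 50.0, 25.0],
])
labels = np.array([0, 0, 1, 1, -1])

recon = reconstruct_from_labels(pixels, labels)
```

With the arrays from the question this would be called as reconstruct_from_labels(mriarr, r). If the image is 256 x 256 (an assumption based on the 65536 rows), the result can be turned back into an image with .reshape(256, 256, 3).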