10
votes

I have some data and also the pairwise distance matrix of these data points. I want to cluster them using agglomerative clustering. I read that in sklearn we can pass 'precomputed' as the affinity, and I expect that this means the input is the distance matrix. But I could not find any example that uses a precomputed affinity and a custom distance matrix. Any help will be appreciated.


1 Answer

19
votes

Let's call your distance matrix D.

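from sklearn.cluster import AgglomerativeClustering
# D must be the square (n_samples, n_samples) distance matrix. In scikit-learn >= 1.2, pass metric='precomputed' instead of affinity.
agg = AgglomerativeClustering(n_clusters=5, affinity='precomputed', linkage='average')
labels = agg.fit_predict(D)  # Returns one cluster label per row of D.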
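For a version you can run end to end, here is a minimal sketch on made-up toy data. The points in `X`, the helper name `cluster_precomputed`, and the choice of two clusters are all assumptions for illustration, not part of the question. The helper tries the newer `metric` keyword first and falls back to `affinity`, since the parameter was renamed in recent scikit-learn releases. Note that `linkage='ward'` is not accepted with a precomputed matrix, so the sketch sticks to `'average'`.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics import pairwise_distances

# Hypothetical toy data: two well-separated groups of 2-D points.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])

# Square (6, 6) pairwise distance matrix; any custom distance would do here.
D = pairwise_distances(X, metric="euclidean")


def cluster_precomputed(D, n_clusters, linkage="average"):
    """Cluster from a square distance matrix (hypothetical helper)."""
    try:
        # Newer scikit-learn releases use the `metric` keyword.
        model = AgglomerativeClustering(
            n_clusters=n_clusters, metric="precomputed", linkage=linkage
        )
    except TypeError:
        # Older releases only know the `affinity` keyword.
        model = AgglomerativeClustering(
            n_clusters=n_clusters, affinity="precomputed", linkage=linkage
        )
    return model.fit_predict(D)


labels = cluster_precomputed(D, n_clusters=2)
print(labels)
```

The first three points should share one label and the last three the other; which group gets label 0 is not something to rely on.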

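If you're interested in generating the entire hierarchy and producing a dendrogram, note that scikit-learn's AgglomerativeClustering builds on SciPy's hierarchical clustering code, so you can just use SciPy directly: scipy.cluster.hierarchy.linkage builds the hierarchy and scipy.cluster.hierarchy.dendrogram draws it. linkage expects a condensed distance matrix, which scipy.spatial.distance.squareform produces from a square one.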
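As a rough sketch of the SciPy route, assuming the same kind of square distance matrix `D` (the toy points, the average linkage method, and the cut into two clusters are illustrative assumptions):

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage
from scipy.spatial.distance import squareform

# Hypothetical toy data: two well-separated groups of 2-D points.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])

# Square Euclidean distance matrix; replace with your own D.
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

# linkage() wants the condensed (upper-triangle) form, not the square matrix.
# squareform() also validates that D is symmetric with a zero diagonal.
condensed = squareform(D)

# Z encodes the full merge hierarchy: one row per merge, n - 1 rows in total.
Z = linkage(condensed, method="average")

# Cut the tree into at most 2 flat clusters (labels start at 1).
labels = fcluster(Z, t=2, criterion="maxclust")

# no_plot=True returns the dendrogram layout without needing matplotlib;
# drop it (with matplotlib installed) to actually draw the tree.
tree = dendrogram(Z, no_plot=True)
print(labels, tree["ivl"])
```

Passing the square matrix straight to `linkage` is a common mistake: SciPy would treat its rows as observation vectors rather than as distances, which is why the `squareform` step is there.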