I have a matrix A = (a1, a2, a3, ..., an)', where a1, a2, ..., an are row vectors. I want to apply the k-means algorithm to A in order to cluster the row vectors ai (i = 1, 2, 3, ..., n) into k clusters or more.

Suppose b1, b2, b3, ..., bk are the centers of the k clusters. First, k samples are randomly selected as the initial centers. All the samples (a1, a2, a3, ..., an) are then assigned to one of k classes, that is, k clusters, according to their cosine distance to the centers bi (i = 1, 2, 3, ..., k). The centers of the k clusters are recalculated and all samples are reassigned, repeating until the centers no longer change, which gives the final centers b1, b2, b3, ..., bk. Finally, for each cluster, only the vector closest to the cluster center is retained. How can I implement this?
2
votes
Are you required to write all of this yourself, or are you allowed to use libraries?
- needarubberduck
@It's Magic, I am allowed to use libraries.
- Shawn
1 Answer
2
votes
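The kmeans function (in the Statistics and Machine Learning Toolbox) performs this clustering. Simply use:
[idx, C] = kmeans(A, k, 'Distance', 'cosine')
to get the cluster index of each row of A in idx and the k cluster centers in the rows of C. Note that the first output of kmeans is the index vector, not the centers. The optional fourth output holds the distance from every row of A to every center, which is what you need to pick the row closest to each center.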
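For the last step of the question (keeping only the vector closest to each center), here is a minimal sketch rather than a definitive implementation. It assumes A has no all-zero rows, since the cosine distance is undefined for them. The random matrix A, the value of k, and the names D, members and retained are placeholders chosen for illustration; substitute your own data.

```matlab
% Placeholder data for illustration only; replace with your own A and k.
rng(0);                 % fix the seed so the run is repeatable
A = rand(20, 5);        % 20 row vectors of length 5, no all-zero rows
k = 3;

% idx: cluster index of each row, C: k-by-p matrix of centers,
% D: n-by-k matrix of cosine distances from each row to each center.
[idx, C, ~, D] = kmeans(A, k, 'Distance', 'cosine');

% For each cluster, keep the member row with the smallest distance
% to that cluster's own center.
retained = zeros(k, size(A, 2));
for j = 1:k
    members = find(idx == j);          % rows assigned to cluster j
    [~, m] = min(D(members, j));       % nearest member to center j
    retained(j, :) = A(members(m), :);
end
```

The search is restricted to the members of each cluster so that a row assigned to one cluster is never returned as the representative of another. kmeans picks random initial centers, so results can differ between runs unless the seed is fixed; the 'Replicates' option can be used to rerun the clustering several times and keep the best solution.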