2
votes

I have a matrix A = (a1, a2, a3, ..., an)', where a1, a2, ..., an are row vectors. I want to apply the k-means algorithm to A in order to group the row vectors ai (i = 1, 2, ..., n) into k (or more) clusters. Let b1, b2, ..., bk be the centers of the k clusters; k samples are randomly selected as the initial centers. Every sample a1, a2, ..., an is assigned to one of the k clusters according to its cosine distance to the centers bi (i = 1, 2, ..., k). The centers are then recalculated and all samples are reassigned, repeating until the centers no longer change, which gives the final centers b1, b2, ..., bk. For each cluster, only the vector closest to the cluster center is retained. How can I implement this?

1
Are you required to write all of this yourself, or are you allowed to use libraries? - needarubberduck
@It's Magic, I am allowed to use libraries. - Shawn

1 Answer

2
votes

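The kmeans function (in the Statistics and Machine Learning Toolbox) covers the clustering part of this. Use:

[idx, C] = kmeans(A, k, 'Distance', 'cosine');

Here idx holds the cluster index of each row of A, and the rows of C are the k cluster centers. Note that the first output of kmeans is the index vector, not the centers. Keeping only the vector closest to each center is a separate step: the fourth output of kmeans gives the distance from every row to every center, which you can use to pick that vector.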
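
For the last part of the question (retaining only the vector closest to each center), here is a minimal sketch rather than a definitive implementation. The data in A, the value of k, and the variable names members and representatives are made up for illustration. It assumes no row of A is all zeros, since cosine distance is undefined for a zero vector. 'Start', 'sample' is used to match the random choice of k samples as initial centers described in the question; leaving it out falls back to the default initialization.

```matlab
% Hypothetical example data: replace A and k with your own.
rng(0);                          % fix the seed so the random start is repeatable
A = rand(100, 5);                % 100 row vectors a_i with 5 components each
k = 4;                           % number of clusters

% idx: cluster index of each row of A
% C:   k-by-5 matrix of cluster centers
% D:   100-by-k matrix, D(i,j) = cosine distance from row i to center j
[idx, C, ~, D] = kmeans(A, k, 'Distance', 'cosine', 'Start', 'sample');

% For each cluster, keep the member row with the smallest distance to its center.
representatives = zeros(k, size(A, 2));
for j = 1:k
    members = find(idx == j);            % rows assigned to cluster j
    [~, pos] = min(D(members, j));       % closest member to center j
    representatives(j, :) = A(members(pos), :);
end
```

Restricting the search to members of each cluster guarantees that the retained vector actually belongs to that cluster. With the cosine distance, kmeans works on the rows after normalizing them to unit length, so the rows of C are centers of the normalized data, while representatives holds the original, unnormalized rows of A.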
