How do I compute the cosine similarity of two documents in Perl?

A few questions:

1) Are there already modules on CPAN for computing cosine similarity, or is this task easy enough to code up myself?

2) When I say "documents", I really mean that one "document" is a sentence and the other "document" is just a list of keywords. To make the comparison fair, should I tokenize, lowercase, and sort the words in each document before computing the cosine similarity?
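
To make the question concrete, here is a minimal sketch of the hand-rolled approach, assuming plain term-frequency vectors and a naive tokenizer that lowercases and splits on non-word characters. The helper names `term_freq` and `cosine_similarity` are hypothetical and not taken from any CPAN module. Because each document becomes a hash keyed by term, word order has no effect on the result, so in this sketch sorting changes nothing, while lowercasing and tokenizing do.

```perl
use strict;
use warnings;

# Hypothetical helper: build a term-frequency hash from a string.
# Assumption: lowercasing and splitting on non-word characters is
# good enough tokenization for short sentences and keyword lists.
sub term_freq {
    my ($text) = @_;
    my %tf;
    $tf{$_}++ for grep { length } split /\W+/, lc $text;
    return \%tf;
}

# Hypothetical helper: cosine similarity of two term-frequency hashes,
# i.e. dot(x, y) / (|x| * |y|). Returns 0 if either vector is empty.
sub cosine_similarity {
    my ($x, $y) = @_;
    my ($dot, $sq_x, $sq_y) = (0, 0, 0);
    for my $term (keys %$x) {
        $sq_x += $x->{$term} ** 2;
        $dot  += $x->{$term} * $y->{$term} if exists $y->{$term};
    }
    $sq_y += $_ ** 2 for values %$y;
    return 0 if $sq_x == 0 || $sq_y == 0;
    return $dot / sqrt($sq_x * $sq_y);
}

my $sentence = term_freq("The quick brown fox jumps over the lazy dog");
my $keywords = term_freq("fox dog cat");
printf "%.4f\n", cosine_similarity($sentence, $keywords);   # prints 0.3482
```

A cosine distance, if that is what is wanted, would then be 1 minus this value.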