
Let's say I have a database of users who rate products on a scale of 1-5. Our recommendation engine recommends products to users based on the preferences of other users who are highly similar. My first approach to finding similar users was to use cosine similarity, treating user ratings as vector components. The main problem with this approach is that cosine similarity only measures the angle between vectors and takes no account of rating scale or magnitude.

My question is this: can somebody explain why cosine similarity is any better suited for judging user similarity than simply measuring the percent difference between the components of two vectors (users)?

For example, why not just do this:

n = 5 stars
a = (1,4,4)
b = (2,3,4)

similarity(a,b) = 1 - ( (|1-2|/5) + (|4-3|/5) + (|4-4|/5) ) / 3 = 0.86667
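
In case it helps, here is a minimal Python sketch of this percent-difference similarity (the function name percent_similarity is mine, not from any library):

import numpy as np

def percent_similarity(a, b, n=5):
    # 1 minus the mean absolute rating difference, scaled by the rating range n
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return 1.0 - np.mean(np.abs(a - b) / n)

print(percent_similarity([1, 4, 4], [2, 3, 4]))  # ~0.86667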

Instead of cosine similarity:

a = (1,4,4)
b = (2,3,4)

CosSimilarity(a,b) =
( (1*2)+(4*3)+(4*4) ) / ( sqrt( (1^2)+(4^2)+(4^2) ) * sqrt( (2^2)+(3^2)+(4^2) ) ) = 0.9697
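
The same computation as a sketch (in practice you could also use scipy.spatial.distance.cosine, which returns 1 minus this value):

import numpy as np

def cosine_similarity(a, b):
    # dot product divided by the product of the vector norms
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_similarity([1, 4, 4], [2, 3, 4]))  # ~0.9697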
This is a good candidate for datascience.stackexchange.com - Sean Owen

1 Answer


I suppose one answer is that not all recommender problems operate on ratings on a 1-5 scale, and not all operate in the original feature space at all; some work in a low-rank latent feature space, where the answer changes.

I don't think cosine similarity is a great metric for raw ratings: the magnitude of a rating carries information, and it isn't something you want to normalize away. It makes more sense if you first normalize each user's ratings to have mean 0.
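
As a sketch of what I mean, assuming a simple mean-centering step (this is essentially the Pearson-style "adjusted cosine" idea; the function name is mine):

import numpy as np

def centered_cosine(a, b):
    # subtract each user's mean rating, then take the cosine, so only
    # deviations from a user's own baseline matter (a user with constant
    # ratings would need special handling to avoid dividing by zero)
    a = np.asarray(a, dtype=float) - np.mean(a)
    b = np.asarray(b, dtype=float) - np.mean(b)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(centered_cosine([1, 4, 4], [2, 3, 4]))  # ~0.866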

I'm not sure this sort of modified L1 distance is optimal either; consider plain Euclidean (L2) distance as well. In the end, empirical testing will tell you what works best for your data.
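
For completeness, an L2 variant you could test alongside the others (converting a distance into a similarity can be done in several ways; this particular scaling is my own choice, not a standard):

import numpy as np

def euclidean_similarity(a, b, n=5):
    # map Euclidean distance into [0, 1]: 1 for identical ratings,
    # 0 when every component differs by the full scale n
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return 1.0 - np.linalg.norm(a - b) / (n * np.sqrt(len(a)))

print(euclidean_similarity([1, 4, 4], [2, 3, 4]))  # ~0.8367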