I'm working on an item-based collaborative filtering (CF) recommender that uses adjusted cosine similarity. I recently added regular cosine similarity as well and got totally different results. My question is: which one fits my data better?
Here is a possible scenario of users, items and ratings:
       | User 1 | User 2 | User 3 | User 4 | User 5
Item 1 |   5    |   1    |   1    |   5    |   5
Item 2 |   5    |   1    |   2    |   4    |   5
Item 3 |   1    |   5    |   4    |   2    |   3
Considering this data, you'd conclude that item 1 and item 2 are relatively 'similar'. Here are the results of the different similarity coefficients:
Similarity between Item 1 and Item 2:
Adjusted cosine similarity = 0.865
Regular cosine similarity = 0.987
I rounded them off for this example
These two values are basically the same, but when I calculate the similarity between Item 2 and Item 3 (which aren't similar at all), the two measures give totally different results:
Similarity between Item 2 and Item 3:
Adjusted cosine similarity = -0.955
Regular cosine similarity = 0.656
I rounded them off for this example
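For reference, here is a minimal sketch of how I understand the two coefficients, in case my calculation itself is the problem. It assumes the adjusted version subtracts each user's mean rating (taken over just these three items) before applying the ordinary cosine formula. The names `ratings`, `cosine`, `user_means` and `adjusted_cosine` are only illustrative and not from any library.

```python
import math

# Ratings from the scenario above: one row per item, one column per user.
ratings = {
    "Item 1": [5, 1, 1, 5, 5],
    "Item 2": [5, 1, 2, 4, 5],
    "Item 3": [1, 5, 4, 2, 3],
}

def cosine(a, b):
    """Plain cosine similarity between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def user_means(table):
    """Mean rating of each user (column) across all items in the table."""
    columns = zip(*table.values())
    return [sum(col) / len(col) for col in columns]

def adjusted_cosine(table, item_a, item_b):
    """Cosine similarity after subtracting each user's mean rating."""
    means = user_means(table)
    a = [r - m for r, m in zip(table[item_a], means)]
    b = [r - m for r, m in zip(table[item_b], means)]
    return cosine(a, b)

for a, b in [("Item 1", "Item 2"), ("Item 2", "Item 3")]:
    regular = cosine(ratings[a], ratings[b])
    adjusted = adjusted_cosine(ratings, a, b)
    print(f"{a} vs {b}: regular={regular:.3f}, adjusted={adjusted:.3f}")
```

Under that assumption, this gives the same values as listed above to within about 0.001. Since every raw rating is positive, the regular cosine can never go below zero on this data, whereas the mean-centered vectors can point in opposite directions.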
Which of these would be 'better'? I assume adjusted cosine similarity works better since it takes each user's average rating into account, but why would regular cosine similarity give a positive number for such 'different' items? Should I refrain from using regular cosine similarity in general, or only in certain scenarios?
Any help would be appreciated!