I have been using the Mahout library to implement a recommendation algorithm. I have used the EuclideanDistanceSimilarity
class and so far my results seem fine.
My DataModel currently consists of 500 ratings for 100 items, which are rated on a scale of 1 to 5, for example:
customer itemID rating
____1 ____2_____8
However, the Apache Mahout API documentation states: "Note that the distance isn't normalized in any way; it's not valid to compare similarities computed from different domains (different rating scales, for example). Within one domain, normalizing doesn't matter much as it doesn't change ordering."
Will this impact the validity/reliability of my results as I capture more customers and items?
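
To make the quoted note concrete, here is a minimal plain-Java sketch (it does not use Mahout). It assumes a similarity of the form 1 / (1 + d), which is a common convention and not necessarily the exact formula inside EuclideanDistanceSimilarity. The class name, the `scale` parameter, and the sample ratings are all made up for illustration. It ranks three neighbours against one target, first with the raw distance and then with the distance divided by a constant.

```java
import java.util.Arrays;
import java.util.Comparator;

public class NormalizationOrderDemo {

    // Plain Euclidean distance between two rating vectors of equal length.
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Assumed similarity form: 1 / (1 + d / scale).
    // scale = 1.0 means the distance is left unnormalized.
    static double similarity(double[] a, double[] b, double scale) {
        return 1.0 / (1.0 + distance(a, b) / scale);
    }

    // Returns neighbour indices ordered from most to least similar to the target.
    static Integer[] ranking(double[] target, double[][] others, double scale) {
        Integer[] order = new Integer[others.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order,
                Comparator.comparingDouble((Integer i) -> -similarity(target, others[i], scale)));
        return order;
    }

    public static void main(String[] args) {
        // Hypothetical ratings on a 1 to 5 scale for three items.
        double[] target = {5, 3, 4};
        double[][] others = {{4, 3, 5}, {1, 1, 2}, {5, 2, 4}};

        // Largest possible distance for three items on a 1 to 5 scale.
        double maxDistance = Math.sqrt(3 * 4 * 4);

        System.out.println(Arrays.toString(ranking(target, others, 1.0)));
        System.out.println(Arrays.toString(ranking(target, others, maxDistance)));
    }
}
```

Both calls print the same ordering, because dividing every distance by the same positive constant changes the similarity values but not which neighbour is closer. The similarity values themselves do depend on the rating scale: the same profiles expressed on a 1 to 10 scale would give larger distances, which is why values from different scales are not comparable.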