I'm trying to use Apache Mahout to build an item-based recommender that, given an item, recommends similar items based on which items users have in common.

I start by creating a DataModel, and I've tried passing it to several different ItemSimilarity implementations:

// Create data model
DataModel datamodel = new FileDataModel(new File("input.csv"));

// ItemSimilarity object
// ItemSimilarity similarity = new EuclideanDistanceSimilarity(datamodel);
// ItemSimilarity similarity = new PearsonCorrelationSimilarity(datamodel);
ItemSimilarity similarity = new CityBlockSimilarity(datamodel);

Then I pass the DataModel and ItemSimilarity into a GenericItemBasedRecommender, call mostSimilarItems(), and store the result in a list:

ItemBasedRecommender irecommender = new GenericItemBasedRecommender(datamodel, similarity);
List<RecommendedItem> irecommendations = irecommender.mostSimilarItems(item, amount);

The CityBlockSimilarity class worked great on a small data set, but as soon as I switched to a large data set its results were no longer reliable.

Is there a different class I should use to return recommendations for an item based on other items that users have in common?

1 Answer

It turns out the class I needed to use was TanimotoCoefficientSimilarity. Once I switched to it, I got the results I wanted:

ItemSimilarity similarity = new TanimotoCoefficientSimilarity(datamodel);

I was able to leave everything else the same and it worked great! See the documentation for the TanimotoCoefficientSimilarity class if you want to read more about it.
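
For anyone wondering what the Tanimoto coefficient actually measures: it is the Jaccard ratio between two sets, here the set of users who have item A and the set of users who have item B (shared users divided by all users who have either item). The sketch below is plain Java and does not use Mahout; the class name TanimotoSketch, the method name, and the sample user IDs are made up for illustration, and returning 0.0 for two empty sets is my own choice, not necessarily what Mahout does.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only; not Mahout's implementation.
class TanimotoSketch {

    // Tanimoto (Jaccard) coefficient: |A ∩ B| / |A ∪ B|,
    // where A and B are the sets of user IDs that have each item.
    static double tanimoto(Set<Long> usersOfItemA, Set<Long> usersOfItemB) {
        Set<Long> intersection = new HashSet<>(usersOfItemA);
        intersection.retainAll(usersOfItemB);

        // |A ∪ B| = |A| + |B| - |A ∩ B|
        int unionSize = usersOfItemA.size() + usersOfItemB.size() - intersection.size();

        // Assumption: treat two items with no users at all as having similarity 0.
        return unionSize == 0 ? 0.0 : (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        Set<Long> usersOfItemA = Set.of(1L, 2L, 3L); // hypothetical user IDs
        Set<Long> usersOfItemB = Set.of(2L, 3L, 4L);

        // Users 2 and 3 are shared; users 1 to 4 have at least one item: 2 / 4
        System.out.println(tanimoto(usersOfItemA, usersOfItemB)); // prints 0.5
    }
}
```

Because the result is a ratio, it always falls between 0 and 1 regardless of how many users the data set contains, and it only looks at whether a user has an item, not at any preference value.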