Sorry, I'am newbie at recommender systems, but i wrote few lines of code using apache mahout lib. Well, my dataset is pretty small, 500x100 with 8102 cells known.
So, my dataset is actually a subset of Yelp dataset from "Yelp business rating prediction" competition. I just take top 100 most commented restaurants, and then take 500 most active customers.
I created SVDRecommender and then I evaluated RMSE. And so the result is about 0.4... Why is it so small? Maybe i just don't understand something and my dataset is not so sparse, but then i tried with larger and more sparse dataset and RMSE become even smaller (about 0.18)! Could anyone explain me such behaviour?
DataModel model = new FileDataModel(new File("datamf.csv"));
final RatingSGDFactorizer factorizer = new RatingSGDFactorizer(model, 20, 200);
final Factorization f = factorizer.factorize();
RecommenderBuilder builder = new RecommenderBuilder() {
public Recommender buildRecommender(DataModel model) throws TasteException {
//build here whatever existing or customized recommendation algorithm
return new SVDRecommender(model, factorizer);
}
};
RecommenderEvaluator evaluator = new RMSRecommenderEvaluator();
double score = evaluator.evaluate(builder,
null,
model,
0.6,
1);
System.out.println(score);