1
votes

I am iteratively building a custom recommender system based on a frequently changing probabilistic latent factor model. I have already written some Java code that implements the model. It factorises the user-item rating matrix into two matrices UxK (user feature vectors) and IxK (item feature vectors) to estimate the missing ratings.
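For concreteness, here is a minimal sketch of how such a factorisation yields the missing ratings: each estimate is the dot product of a user's row in the UxK matrix and an item's row in the IxK matrix. The class and method names are hypothetical and not taken from my actual code.

```java
// Minimal sketch; class and method names are hypothetical.
final class LatentFactorSketch {

    /** Estimated rating: dot product of a user's and an item's K-dimensional vectors. */
    static double predict(double[] userVector, double[] itemVector) {
        if (userVector.length != itemVector.length) {
            throw new IllegalArgumentException("feature vectors must share the same K");
        }
        double sum = 0.0;
        for (int k = 0; k < userVector.length; k++) {
            sum += userVector[k] * itemVector[k];
        }
        return sum;
    }

    /** Dense U x I matrix of estimates: user features times the transpose of item features. */
    static double[][] estimateAll(double[][] userFeatures, double[][] itemFeatures) {
        double[][] estimates = new double[userFeatures.length][itemFeatures.length];
        for (int u = 0; u < userFeatures.length; u++) {
            for (int i = 0; i < itemFeatures.length; i++) {
                estimates[u][i] = predict(userFeatures[u], itemFeatures[i]);
            }
        }
        return estimates;
    }
}
```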

I am looking for the simplest way to plug (perhaps by rewriting) my code into a framework to build a recommender system, a baseline, and be able to compare these against each other in a standard way - e.g. cross validation to calculate precision, recall, RMSE... As my system still lacks this, the framework should provide methods to calculate and make recommendations based on the estimated user-item rating matrix.
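To illustrate the kind of functionality I mean, the sketch below ranks one user's unrated items by estimated rating and scores the resulting list with precision and recall. It uses only the standard library; `TopNSketch` and its methods are hypothetical names, and treating "relevant" as a given set of held-out item indices is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Minimal sketch; names are hypothetical and item IDs are plain array indices.
final class TopNSketch {

    /** Indices of the n highest-scoring items that the user has not rated yet. */
    static List<Integer> topN(double[] estimatedRow, Set<Integer> alreadyRated, int n) {
        List<Integer> candidates = new ArrayList<>();
        for (int i = 0; i < estimatedRow.length; i++) {
            if (!alreadyRated.contains(i)) {
                candidates.add(i);
            }
        }
        // Sort by estimated rating, highest first.
        candidates.sort((a, b) -> Double.compare(estimatedRow[b], estimatedRow[a]));
        return new ArrayList<>(candidates.subList(0, Math.min(n, candidates.size())));
    }

    /** Number of recommended items that are in the relevant (held-out) set. */
    static int hits(List<Integer> recommended, Set<Integer> relevant) {
        int count = 0;
        for (int item : recommended) {
            if (relevant.contains(item)) {
                count++;
            }
        }
        return count;
    }

    /** Fraction of recommended items that are relevant. */
    static double precision(List<Integer> recommended, Set<Integer> relevant) {
        return recommended.isEmpty() ? 0.0 : (double) hits(recommended, relevant) / recommended.size();
    }

    /** Fraction of relevant items that were recommended. */
    static double recall(List<Integer> recommended, Set<Integer> relevant) {
        return relevant.isEmpty() ? 0.0 : (double) hits(recommended, relevant) / relevant.size();
    }
}
```

In a cross-validation setup, the relevant set would come from the held-out fold and the two metrics would be averaged over users and folds.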

It looks like Mahout should do the job. However, its documentation says "It does not currently support model-based recommenders." Can anybody tell me whether what I am trying to achieve is possible with Mahout, and whether it is worth spending the time to learn how to use it? If Mahout is not suitable, can you suggest any alternatives?

Many thanks!


2 Answers

3
votes

I'd say you are better off asking the nice fellows on the Mahout mailing list.

That said, Mahout provides SVD-based recommenders that use different factorizers for the matrix calculations. For instance, there's the ALSWRFactorizer, which supports two modes:

  1. Factorizing an explicit-feedback rating matrix. See paper
  2. Factorizing an implicit-feedback variant. See paper

It should be easy to extend functionality by implementing your own recommender (extend AbstractRecommender) or by implementing your own factorizer (extend AbstractFactorizer). Nonetheless, without knowing more about your approach or your implementation I cannot really say more.
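The general shape of such a plug-in design can be sketched without Mahout at all. The code below is not Mahout's API: `RatingModel`, `GlobalMeanModel` and `FactorModel` are hypothetical names, and the sketch only assumes that your code already produces the two feature matrices. It shows a baseline and a latent factor model behind one interface, so both can be scored with the same RMSE routine.

```java
// Minimal sketch of a pluggable model plus baseline; not Mahout's API.
final class PluggableModelSketch {

    /** Hypothetical plug-in point: anything that can estimate a rating. */
    interface RatingModel {
        double estimate(int user, int item);
    }

    /** Baseline: always predicts the mean of the training ratings. */
    static final class GlobalMeanModel implements RatingModel {
        private final double mean;

        GlobalMeanModel(double[] trainingRatings) {
            double sum = 0.0;
            for (double r : trainingRatings) {
                sum += r;
            }
            this.mean = trainingRatings.length == 0 ? 0.0 : sum / trainingRatings.length;
        }

        @Override
        public double estimate(int user, int item) {
            return mean;
        }
    }

    /** Latent factor model backed by precomputed user and item feature matrices. */
    static final class FactorModel implements RatingModel {
        private final double[][] userFeatures;
        private final double[][] itemFeatures;

        FactorModel(double[][] userFeatures, double[][] itemFeatures) {
            this.userFeatures = userFeatures;
            this.itemFeatures = itemFeatures;
        }

        @Override
        public double estimate(int user, int item) {
            double sum = 0.0;
            for (int k = 0; k < userFeatures[user].length; k++) {
                sum += userFeatures[user][k] * itemFeatures[item][k];
            }
            return sum;
        }
    }

    /** RMSE of a model over held-out {user, item} pairs and their true ratings. */
    static double rmse(RatingModel model, int[][] userItemPairs, double[] actualRatings) {
        double squaredError = 0.0;
        for (int n = 0; n < actualRatings.length; n++) {
            double diff = model.estimate(userItemPairs[n][0], userItemPairs[n][1]) - actualRatings[n];
            squaredError += diff * diff;
        }
        return Math.sqrt(squaredError / actualRatings.length);
    }
}
```

In Mahout itself, the corresponding extension points are the AbstractRecommender and AbstractFactorizer classes mentioned above.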

0
votes

There are two classes of recommenders: data-based (they generate recommendations for specific users directly from the data, e.g. SVD) and model-based (they build a model that then produces recommendations from user data, e.g. RBM).

Mahout does not support model-based recommenders (it doesn't have the appropriate interfaces to do so). You can implement some algorithms, but out of the box you won't be able to use some features of the model-based approach.

By the way, I'd prefer MyMediaLite (if your dataset is small enough to avoid Hadoop). MML supports ensembles and offers more algorithms.