I'm working on a project where I need to implement an article/news recommendation engine. I'm thinking of combining different methods (item-based, user-based, and model-based CF) and have a question about which tool to use.

From my research, Lucene is clearly the tool for text processing, but for the recommendation part it is less clear. If I want to implement item-based CF on articles using text similarity:

- I've seen case studies using Mahout but also Solr (http://fr.slideshare.net/lucenerevolution/building-a-realtime-solrpowered-recommendation-engine). Since this is really close to a search problem, I would think Solr is the better fit. Am I right?
- What are the differences in processing time between the two tools? (I think Mahout is more batch-oriented and Solr more real-time.)
- Can I get a text distance directly from Lucene? (It's not really clear to me what Solr adds over Lucene.)
- For more advanced methods (model-based, using matrix factorization), I would use Mahout, but is there any SVD-like function in Solr for concept/tag discovery?

Thanks for your help.

1 Answer

It depends on your requirements. If you only need offline recommendations, Mahout is a good choice. For online recommendations, I am still testing it myself. I have tested Lucene and Mahout together, and they work fine. As for Solr, I'm not so sure; all I know is that it uses Lucene at its core, so the heavy lifting is still done by Lucene. In my case, I combined Mahout and Lucene in my Java program: Lucene does the preprocessing and basic similarity calculations, and the results are then sent to Mahout for further analysis.
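To make the "preprocessing and similarity" step a bit more concrete, here is a rough plain-Java sketch of an item-item text similarity (TF-IDF weights plus cosine similarity). This is not the Lucene or Mahout API and not my actual program: the class name `TextSimilaritySketch`, the simple tokenizer, and the smoothed IDF formula are all assumptions made for illustration. In a real setup an analyzer and index would replace the tokenizer, and the resulting matrix is the kind of item-item similarity you could hand to an item-based recommender.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: TF-IDF + cosine similarity between article texts.
public class TextSimilaritySketch {

    // Lowercase and split on non-alphanumerics; a crude stand-in for a real analyzer.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!t.isEmpty()) tokens.add(t);
        }
        return tokens;
    }

    // Build one sparse TF-IDF vector (term -> weight) per document.
    static List<Map<String, Double>> tfidf(List<String> docs) {
        List<Map<String, Double>> vectors = new ArrayList<>();
        Map<String, Integer> docFreq = new HashMap<>();
        for (String doc : docs) {
            Map<String, Double> counts = new HashMap<>();
            for (String term : tokenize(doc)) counts.merge(term, 1.0, Double::sum);
            for (String term : counts.keySet()) docFreq.merge(term, 1, Integer::sum);
            vectors.add(counts);
        }
        int n = docs.size();
        for (Map<String, Double> vec : vectors) {
            // Assumed weighting: tf * (ln(N / df) + 1), so common terms keep a small weight.
            vec.replaceAll((term, tf) -> tf * (Math.log((double) n / docFreq.get(term)) + 1.0));
        }
        return vectors;
    }

    // Cosine similarity of two sparse vectors; 0 if either vector is empty.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Double w = b.get(e.getKey());
            if (w != null) dot += e.getValue() * w;
        }
        for (double w : b.values()) normB += w * w;
        return (normA == 0 || normB == 0) ? 0.0 : dot / Math.sqrt(normA * normB);
    }

    // Full item-item similarity matrix for a small collection of articles.
    static double[][] similarityMatrix(List<String> docs) {
        List<Map<String, Double>> vecs = tfidf(docs);
        double[][] sim = new double[docs.size()][docs.size()];
        for (int i = 0; i < docs.size(); i++) {
            for (int j = 0; j < docs.size(); j++) {
                sim[i][j] = cosine(vecs.get(i), vecs.get(j));
            }
        }
        return sim;
    }

    // Index of the article most similar to `item` (excluding itself), or -1 if none.
    static int mostSimilar(double[][] sim, int item) {
        int best = -1;
        for (int j = 0; j < sim.length; j++) {
            if (j != item && (best == -1 || sim[item][j] > sim[item][best])) best = j;
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> articles = new ArrayList<>();
        articles.add("Solr adds a search server on top of Lucene");
        articles.add("Lucene is a search library used by Solr");
        articles.add("Matrix factorization for collaborative filtering");
        double[][] sim = similarityMatrix(articles);
        System.out.println("Most similar to article 0: " + mostSimilar(sim, 0));
    }
}
```

Computing the full matrix like this is quadratic in the number of articles, so it only suits the offline/batch case. For online use you would query an index for the nearest articles instead of comparing every pair.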