
Can someone please help me clarify the following?

I am currently using collaborative filtering (ALS), which returns a recommendation list with a score for each recommended item. On top of this, I boost an item's score (+0.1) if it carries a tag matching a preference the user has specified, such as "romantic movies". I consider this a hybrid approach, since it boosts the collaborative filtering results with content-based information (please correct me if I am wrong).
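For concreteness, the boosting step could look roughly like the sketch below. The function name `boost_scores`, the dictionaries, and the sample data are all made up for illustration; only the +0.1 boost on a tag match comes from the description above, and the CF scores stand in for whatever ALS returns.

```python
from typing import Dict, List, Set, Tuple


def boost_scores(
    cf_scores: Dict[str, float],
    item_tags: Dict[str, Set[str]],
    preferred_tags: Set[str],
    boost: float = 0.1,
) -> List[Tuple[str, float]]:
    """Add a fixed boost to every CF score whose item has a preferred tag."""
    boosted = {}
    for item, score in cf_scores.items():
        has_match = bool(item_tags.get(item, set()) & preferred_tags)
        boosted[item] = score + boost if has_match else score
    # Highest score first.
    return sorted(boosted.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical ALS output and item tags.
cf_scores = {"movie_a": 0.80, "movie_b": 0.75, "movie_c": 0.60}
item_tags = {
    "movie_a": {"action"},
    "movie_b": {"romantic"},
    "movie_c": {"romantic", "comedy"},
}

ranked = boost_scores(cf_scores, item_tags, {"romantic"})
for item, score in ranked:
    print(item, round(score, 2))
```

Here the boost is enough to move `movie_b` ahead of `movie_a`, while `movie_c` is boosted but stays last.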

Now, what if I took the same approach without collaborative filtering? Would it be considered content-based filtering, since I would still be recommending dishes based on the content and attributes of each dish that match what the user has said they like (such as "romantic movies")?

The reason I'm confused is that the content-based filtering I've seen applies an algorithm such as Naive Bayes, whereas this approach would be closer to a simple search over the items' contents.


1 Answer


I'm not sure you can do what you suggest, because without CF you have no score to boost.
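To illustrate the point, here is a minimal sketch of what is left once the CF scores are gone. The name `tag_match` and the sample data are hypothetical. A plain tag match behaves like a search filter: an item either matches or it doesn't, and there is no underlying score for a +0.1 boost to act on.

```python
from typing import Dict, List, Set


def tag_match(item_tags: Dict[str, Set[str]], preferred_tags: Set[str]) -> List[str]:
    """Return the items that share at least one tag with the user's preferences."""
    # Sorted by id only to make the output stable; the order carries no relevance.
    return sorted(item for item, tags in item_tags.items() if tags & preferred_tags)


# Hypothetical catalog.
item_tags = {
    "movie_a": {"action"},
    "movie_b": {"romantic"},
    "movie_c": {"romantic", "comedy"},
}

matches = tag_match(item_tags, {"romantic"})
print(matches)
```

Both matching items come back on equal footing, which is why this looks more like a search than a recommender.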

You are indeed using a hybrid, much the same as the Universal Recommender. To do purely content-based recommendations, you have to implement two methods:

  • Personalized recommendations: here you have to look at the content of items the user preferred and find items that have similar content. This can be done by using something like the Mahout spark-rowsimilarity job to create a model of item: list-of-similar-items, then indexing the results with a search engine and using the user's preferred item ids as the query. This is being added to the Universal Recommender.
  • "People who liked this also liked these": these are items similar to the one being viewed, for example, and are the same for all users. They are not personalized, so they are useful even for anonymous users with no history. This can be done with the same indexed ids as above, but using the items similar to the one being viewed as the query. One might think to use only the similar items themselves, but by using them as a query you can put the categorical boost in the search engine query and have boosted items returned. This already works in the Universal Recommender, but the similar items are not in the model yet.
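The two methods above can be sketched in plain Python. This is only an illustration under stated assumptions, not how Mahout or the Universal Recommender implement it: Jaccard similarity on tag sets stands in for the spark-rowsimilarity job, a dictionary scan stands in for the search engine, and all names (`build_similar_items`, `query`, the sample items) are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_similar_items(item_tags: Dict[str, Set[str]], top_n: int = 2) -> Dict[str, List[str]]:
    """Model of item: list-of-similar-items, based on tag overlap."""
    model = {}
    for item, tags in item_tags.items():
        scored = [
            (jaccard(tags, other_tags), other)
            for other, other_tags in item_tags.items()
            if other != item
        ]
        scored = [(s, o) for s, o in scored if s > 0]
        scored.sort(key=lambda so: (-so[0], so[1]))  # best first, ties by id
        model[item] = [o for _, o in scored[:top_n]]
    return model


def query(
    model: Dict[str, List[str]],
    item_tags: Dict[str, Set[str]],
    query_ids: Iterable[str],
    exclude: Set[str],
    boost_tags: Set[str] = frozenset(),
    boost: float = 0.1,
) -> List[str]:
    """Rank indexed items by how many query ids occur in their similar-items list."""
    query_set = set(query_ids)
    hits = {}
    for item, similar in model.items():
        if item in exclude:
            continue
        score = float(len(query_set & set(similar)))
        if score > 0:
            if item_tags[item] & boost_tags:
                score += boost  # categorical boost applied inside the query
            hits[item] = score
    return sorted(hits, key=lambda i: (-hits[i], i))


# Hypothetical catalog.
item_tags = {
    "m1": {"romantic", "drama"},
    "m2": {"romantic", "comedy"},
    "m3": {"action", "thriller"},
    "m4": {"action", "comedy"},
    "m5": {"romantic", "drama", "period"},
}
model = build_similar_items(item_tags)

# Personalized: the user's preferred item ids are the query.
liked = ["m1", "m2"]
personalized = query(model, item_tags, liked, exclude=set(liked))

# Non-personalized: the items similar to the one being viewed are the query,
# with a categorical boost on "romantic".
viewed = "m1"
similar_to_viewed = query(
    model, item_tags, model[viewed], exclude={viewed}, boost_tags={"romantic"}
)

print(personalized)
print(similar_to_viewed)
```

In the second call, two candidates match the query equally well, and the categorical boost decides the order between them, which is the reason for running the similar items as a query rather than returning them directly.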

That said, mixing content with collaborative filtering will almost surely give better results, since CF works better when the data is available. The only times to rely on content-based recommendations are when your catalog consists of one-off items that never get enough CF interactions, or when you have rich content with a short lifetime, like breaking news.

BTW, anyone who wants to help add the pure content-based part to the Universal Recommender can contact its new maintainers at ActionML.com.