2
votes

Content-based filtering (CBF): It works on the basis of product/item attributes. Say user_1 has ordered (or liked) some items in the past. We identify the relevant features of those items and compare them with the features of other items to recommend new ones. One well-known model for finding similar items based on a feature set is a random forest or a decision tree.
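To make the item-attribute idea concrete, here is a minimal sketch. It does not use a random forest or decision tree; it swaps in plain cosine similarity between item feature vectors, which is a common simple choice. The item names, the three features, and the function names are all made up for illustration.

```python
import math

# Hypothetical item feature vectors: [is_action, is_comedy, is_romance]
ITEM_FEATURES = {
    "item_a": [1.0, 0.0, 0.0],
    "item_b": [1.0, 0.5, 0.0],
    "item_c": [0.0, 0.0, 1.0],
    "item_d": [1.0, 0.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend_content_based(liked_items, item_features, top_n=2):
    """Rank unseen items by similarity to the average feature vector of liked items."""
    dims = len(next(iter(item_features.values())))
    # The user profile is the mean of the feature vectors of the liked items.
    profile = [
        sum(item_features[name][d] for name in liked_items) / len(liked_items)
        for d in range(dims)
    ]
    candidates = [
        (name, cosine(profile, vec))
        for name, vec in item_features.items()
        if name not in liked_items
    ]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in candidates[:top_n]]


print(recommend_content_based(["item_a"], ITEM_FEATURES))
```

Note that only user_1's own history and the item attributes are used here; no other user's behavior is involved.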

Collaborative filtering (CLF): It uses user behavior. Say user_1 has ordered (or liked) some items in the past. We then find similar users: users who ordered or liked the same items in the past can be considered similar. We can then recommend some of the items ordered by those similar users, ranked by score. One well-known model for finding similar users is KNN.

Question: Say I have to find similar users based not on their behavior (as I described above) but on user profile features such as nationality, height, weight, language, salary, etc. Would that be considered CBF or CLF?

My second, related doubt: neither CBF nor CLF will work for a new user, since they have not yet done any activity in the system. Is that correct? Is the same true when the system itself is new or just launched, since we won't have much data?


1 Answer

0
votes

You can think of the content-based approach as a regression problem in which the x_i's are your data points (feature vectors) and the corresponding y_i's are the ratings given by the user. You have stated CLF correctly: it uses a user-item matrix, from which it builds item-item or user-user similarity matrices, and then recommends products/items based on those matrices.

In the content-based approach, by contrast, you need to build a vector for each user. For example, say we want to create a vector for a Netflix user. This vector can include features such as how many movies the user has watched, which genres of movies he/she likes, whether he/she is a critical user, and so on, along with some of the features you mentioned, such as salary. Each such vector has a corresponding y_i, which is the rating. Recommendation systems of this kind are known as content-based, and this answers your first question.
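As a rough illustration of the regression framing, here is a small sketch that fits a linear model from feature vectors x_i to ratings y_i with plain gradient descent. The two features, the ratings, the learning rate, and the function names are all assumptions made up for this example; in practice you would use a library regressor and richer features.

```python
def fit_linear(xs, ys, lr=0.05, epochs=2000):
    """Fit rating ~ bias + w . x by batch gradient descent on mean squared error."""
    n_features = len(xs[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = bias + sum(w * v for w, v in zip(weights, x)) - y
            for j, v in enumerate(x):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Predicted rating for one feature vector."""
    return bias + sum(w * v for w, v in zip(weights, x))


# Hypothetical training data for one user: x_i = [is_action, is_romance], y_i = rating
xs = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
ys = [5.0, 4.0, 1.0, 2.0]

weights, bias = fit_linear(xs, ys)
action_score = predict(weights, bias, [1.0, 0.0])
romance_score = predict(weights, bias, [0.0, 1.0])
print(action_score > romance_score)
```

The same setup extends to profile features such as nationality or salary: they simply become extra entries in each x_i.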

Coming to your second question: how does one recommend items when a new user or item comes into the picture? This is known as the cold start problem. In that case you can use the geographical location of the user to pick the top items watched by people in his/her country and recommend those. Once the user starts rating those top items, both CLF and the content-based approach can work as they normally do.
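A minimal sketch of that popularity fallback, assuming a hypothetical interaction log of (user, country, item) tuples. The global fallback for a country with no data is my own addition, not something stated above.

```python
from collections import Counter

# Hypothetical interaction log: (user_id, country, item_id)
INTERACTIONS = [
    ("u1", "IN", "item_a"),
    ("u2", "IN", "item_a"),
    ("u3", "IN", "item_b"),
    ("u4", "US", "item_c"),
    ("u5", "US", "item_c"),
    ("u6", "US", "item_a"),
]


def top_items_for_country(interactions, country, top_n=2):
    """Most-watched items in the given country, for users with no history yet."""
    counts = Counter(item for _, c, item in interactions if c == country)
    if not counts:
        # Assumption: if the country has no data, fall back to global popularity.
        counts = Counter(item for _, _, item in interactions)
    return [item for item, _ in counts.most_common(top_n)]


print(top_items_for_country(INTERACTIONS, "IN"))
```

For a brand-new system with no interactions at all, this returns an empty list, so you would need some other source (editorial picks, for instance) until data accumulates.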