1
votes

I am trying to build a recommender system using collaborative filtering. The issues I am facing are:

  1. The user-item dataset consists mostly of categorical variables, so I can't find a good way to calculate the similarity matrix. Euclidean and cosine distance will not work here, so I am trying Jaccard distance.
  2. The dataset does not have user ratings for items. Instead, each interaction has a class label: "did not buy", "buy", or "added to cart but did not buy".
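
For reference, the Jaccard computation in point 1 looks roughly like the sketch below. It assumes each user is reduced to the set of items they bought; the `purchases` data and the `jaccard` helper are made up for illustration and are not from the real dataset.

```python
from itertools import combinations

# Hypothetical data: the set of items each user bought.
purchases = {
    "u1": {"A", "B", "C"},
    "u2": {"B", "C"},
    "u3": {"A", "D"},
}


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Pairwise user-user similarity; Jaccard distance is 1 - similarity.
similarity = {
    (u, v): jaccard(purchases[u], purchases[v])
    for u, v in combinations(sorted(purchases), 2)
}

for pair, score in similarity.items():
    print(pair, round(score, 3))
```

The same function works item-to-item if the sets hold the users who bought each item instead.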

We have used XGBoost to estimate the likelihood that a particular user buys a particular item, but this kind of dataset has not helped with making recommendations.

Can you please suggest a recommendation algorithm (preferably in Python) that handles classification labels and categorical data?

Thanks in advance.

1
You can use a traditional matrix factorization approach such as SVD. With regard to #2, since there's an inherent order to the classes, you can assign numeric values to them (e.g. "did not buy" = 0, "added to cart but did not buy" = 1, "buy" = 2). Matrix factorization featured prominently in the winning Netflix Prize solutions; ALS is one way to minimize the cost function, though you could also use gradient descent. Maybe take a look at this. - Scratch'N'Purr
Thanks, will try and get back if any other issues. - Suvajit
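
The ordinal encoding and matrix factorization suggested in the comment above can be sketched as follows. This is a minimal illustration, not the commenter's code: it uses plain stochastic gradient descent rather than SVD or ALS, and the `events` log, the `factorize` and `predict` helpers, and all hyperparameters are assumptions chosen for the example.

```python
import random

# Ordinal encoding of the three classes, as suggested in the comment.
LEVELS = {"did not buy": 0, "added to cart but did not buy": 1, "buy": 2}

# Hypothetical interaction log: (user, item, outcome).
events = [
    ("u1", "A", "buy"),
    ("u1", "B", "added to cart but did not buy"),
    ("u2", "A", "buy"),
    ("u2", "C", "did not buy"),
    ("u3", "B", "buy"),
    ("u3", "C", "added to cart but did not buy"),
]


def predict(P, Q, user, item):
    """Predicted score: dot product of the user and item factor vectors."""
    return sum(p * q for p, q in zip(P[user], Q[item]))


def factorize(events, n_factors=2, lr=0.05, reg=0.01, epochs=500, seed=0):
    """Fit user factors P and item factors Q on observed entries with SGD."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in events})
    items = sorted({i for _, i, _ in events})
    P = {u: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for u in users}
    Q = {i: [rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for i in items}
    for _ in range(epochs):
        for user, item, outcome in events:
            err = LEVELS[outcome] - predict(P, Q, user, item)
            for k in range(n_factors):
                pu, qi = P[user][k], Q[item][k]
                # Gradient step on squared error with L2 regularization.
                P[user][k] += lr * (err * qi - reg * pu)
                Q[item][k] += lr * (err * pu - reg * qi)
    return P, Q


P, Q = factorize(events)

# Mean squared error on the observed interactions.
mse = sum(
    (LEVELS[o] - predict(P, Q, u, i)) ** 2 for u, i, o in events
) / len(events)
print("training MSE:", round(mse, 4))

# Score an item the user has not interacted with yet.
print("u1 -> C:", round(predict(P, Q, "u1", "C"), 3))
```

To recommend, score every item a user has not interacted with and rank by predicted value. Note that mapping the classes to 0, 1, 2 assumes the gaps between classes are equal, which may or may not suit the data.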

1 Answer

0
votes

Association rule mining could be helpful here. It estimates how likely items are to appear together in a user's history, typically through measures such as support, confidence, and lift. It is a different technique from collaborative filtering and is useful in different ways: it needs no ratings, only records of which items occurred together.
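
As a rough sketch of the idea, the snippet below computes support, confidence, and lift for rules between pairs of items only. It is not a full Apriori or FP-Growth implementation, and the `baskets` data and the `pair_rules` helper are hypothetical, with each basket standing for the items one user bought.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase histories: one set of items per user.
baskets = [
    {"A", "B"},
    {"A", "B", "C"},
    {"A", "C"},
    {"B", "D"},
]


def pair_rules(baskets):
    """Return {(lhs, rhs): (support, confidence, lift)} for all item pairs."""
    n = len(baskets)
    item_count = Counter(item for basket in baskets for item in basket)
    pair_count = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = {}
    for (x, y), together in pair_count.items():
        for lhs, rhs in ((x, y), (y, x)):
            support = together / n                    # P(lhs and rhs)
            confidence = together / item_count[lhs]   # P(rhs | lhs)
            lift = confidence / (item_count[rhs] / n) # P(rhs | lhs) / P(rhs)
            rules[(lhs, rhs)] = (support, confidence, lift)
    return rules


rules = pair_rules(baskets)

# Strongest rules first, ranked by lift.
for (lhs, rhs), (sup, conf, lift) in sorted(
    rules.items(), key=lambda kv: kv[1][2], reverse=True
):
    print(f"{lhs} -> {rhs}: support={sup:.2f} confidence={conf:.2f} lift={lift:.2f}")
```

A lift above 1 means the two items co-occur more often than they would if bought independently, so for a user who bought `lhs`, the `rhs` items with the highest lift are candidate recommendations.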