3 votes

I am trying to build a classifier using sklearn.svm.SVC but I would like to train the kernel separately on different subsets of features to better represent the feature space (as described here).

I have read the User Guide page and I understand that I can create kernels that are sums of individual kernels, or feed the SVC a precomputed kernel (kernel='precomputed'), but I do not understand how to apply different kernels to different features. Is there a way to implement this in sklearn?

I have found a way to calculate kernels in sklearn (https://scikit-learn.org/stable/modules/gaussian_process.html#gp-kernels), so I could calculate a kernel on each feature subset separately. However, once I have the resulting kernel matrices, I am not sure how I would use them to train the SVM.
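For example, I could compute something like this (the column split and the kernel choices below are just placeholders for my real feature groups):

import numpy as np
from sklearn.gaussian_process.kernels import RBF

X = np.random.rand(100, 10)          # toy data: 100 samples, 10 features
X_a, X_b = X[:, :4], X[:, 4:]        # two feature subsets (placeholder split)

# one kernel per feature subset; each call returns an (n_samples, n_samples) matrix
K_a = RBF(length_scale=1.0)(X_a)
K_b = RBF(length_scale=5.0)(X_b)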

Do I have to create a custom kernel like:

# placeholder: kernel_X and kernel_Y stand for the two different kernels
def custom_kernel(X, Y):
    if feature == condition1:
        return kernel_X(X, Y)
    else:
        return kernel_Y(X, Y)

and pass it to the SVC?

Or are there any other Python libraries I could use for this?


2 Answers

1 vote

You are referring to the problem of Multiple Kernel Learning (MKL), where you train different kernels on different groups of features. I have used this in a multi-modal setting, where I wanted different kernels for image and text features.

I am not sure if you actually can do it via scikit-learn.

There are some libraries available on GitHub, for example this one: https://github.com/IvanoLauriola/MKLpy

Hopefully, it can help you to achieve your goal.
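For a rough idea, usage looks roughly like the sketch below (based on MKLpy's EasyMKL algorithm; the feature split, kernel choices, and lam value are placeholders, and the exact API should be checked against the library's current documentation):

import numpy as np
from sklearn.metrics.pairwise import rbf_kernel, linear_kernel
from MKLpy.algorithms import EasyMKL   # third-party: pip install MKLpy

# toy data with two feature groups (placeholder split at column 6)
X = np.random.rand(100, 10)
y = np.random.randint(0, 2, 100)

# one precomputed kernel per feature group, e.g. image-like vs. text-like features
K_img = rbf_kernel(X[:, :6])
K_txt = linear_kernel(X[:, 6:])

# EasyMKL learns a weighted combination of the kernels in the list
mkl = EasyMKL(lam=0.1).fit([K_img, K_txt], y)
y_pred = mkl.predict([K_img, K_txt])   # at test time, pass test-vs-train kernels instead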

0 votes

Multiple kernel learning is possible in sklearn. Just specify kernel='precomputed' and then pass the kernel matrix you want to use to fit.

Suppose the kernel matrix you want is the sum of two other kernel matrices, K1 and K2. You can compute K1 and K2 however you like and call fit(K1 + K2, y) on an SVC created with kernel='precomputed'.
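For example, with an RBF kernel on one block of features and a linear kernel on another (the column split and gamma value here are arbitrary), training and prediction look like this; note that at prediction time you need the kernel between the test and training samples:

import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel, linear_kernel

# toy data: the first 4 columns form one feature group, the rest the other
X_train, X_test = np.random.rand(80, 10), np.random.rand(20, 10)
y_train = np.random.randint(0, 2, 80)

def combined_kernel(A, B):
    # sum of an RBF kernel on the first group and a linear kernel on the second
    K1 = rbf_kernel(A[:, :4], B[:, :4], gamma=0.5)
    K2 = linear_kernel(A[:, 4:], B[:, 4:])
    return K1 + K2

clf = SVC(kernel='precomputed')
clf.fit(combined_kernel(X_train, X_train), y_train)      # train-vs-train kernel
y_pred = clf.predict(combined_kernel(X_test, X_train))   # test-vs-train kernel

Alternatively, since SVC also accepts a callable kernel, you can pass the function directly with SVC(kernel=combined_kernel) and then fit and predict on the raw feature matrices instead of precomputed ones.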