I'm struggling to find a way to get the equation of a linear SVM model in the regression case, since most existing questions deal with classification. I fit the model with the caret package.

1- univariate case

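library(caret)

set.seed(1)
fit=train(mpg~hp, data=mtcars, method="svmLinear")
plot(x=mtcars$hp, y=predict(fit, mtcars), pch=15)   # SVM fitted values (black)
points(x=mtcars$hp, y=mtcars$mpg, col="red")        # actual values (red)
abline(lm(mpg~hp, mtcars), col="blue")              # ordinary linear regression (blue)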

This gives a plot with the actual values in red, the SVM fitted values in black, and the classic linear regression line in blue. In this case I know I could calculate the SVM prediction line manually from 2 points, but is there a way to get the equation directly from the model structure? I need the equation in the form y=a+bx (here mpg=?+?*hp), with the values on the original scale.

2- multivariate case

Same question, but with 2 independent variables (mpg~hp+wt). Thanks.

1 Answer

Yes, I believe there is. Take a look at this answer, which is similar but does not use the caret library. If you add svp = fit$finalModel to the example, you should be able to follow it almost exactly. I applied a similar technique to your data below. I scaled the data beforehand so that it fits nicely on the plot of the vectors, since the library scales the data at run time.

library(caret)
library(kernlab) # provides alphaindex() and b() used below

set.seed(1)
x = model.matrix(data=mtcars, mpg ~ scale(hp)) #set up data
y = mtcars$mpg

fit=train(x, y, method="svmLinear") #train

svp = fit$finalModel #extract s4 model object

plot(x, xlab="", ylab="")
w <- colSums(coef(svp)[[1]] * x[unlist(alphaindex(svp)),])
b <- b(svp)
abline(b/w[1],-w[2]/w[1], col='red')
abline((b+1)/w[1],-w[2]/w[1],lty=2, col='red')
abline((b-1)/w[1],-w[2]/w[1],lty=2, col='red')

And your second question:

x = model.matrix(data=mtcars, mpg ~ scale(hp) + scale(wt) - 1) #set up data
fit=train(x, y, method="svmLinear") #train

svp = fit$finalModel #extract s4 model object

plot(x, xlab="", ylab="")
w <- colSums(coef(svp)[[1]] * x[unlist(alphaindex(svp)),])
b <- b(svp)
abline(b/w[1],-w[2]/w[1], col='red')
abline((b+1)/w[1],-w[2]/w[1],lty=2, col='red')
abline((b-1)/w[1],-w[2]/w[1],lty=2, col='red')

Edit

The above answer concerns plotting a boundary, not the linear SVM regression line. To answer the question, one easy way to get the line is to extract the predicted values and regress them on the predictor. You actually only need a couple of points to get the line, but for simplicity I used the following code (with fit trained through the formula interface, mpg~hp, as in the question).

abline(lm(predict(fit, newdata=mtcars) ~ mtcars$hp), col='green')

or

abline(lm(predict(fit) ~ mtcars$hp), col='green')
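The "couple of points" idea can also be written out so that it returns the numbers a and b directly, on the original scale. The following is only a sketch: linear_equation() is a hypothetical helper (not part of caret or kernlab), and it assumes the prediction is an exactly affine function of the predictors, which should hold for method="svmLinear" because both the linear kernel and the internal centering and scaling are affine. It probes predict() at the origin and at one unit step along each predictor, so it covers mpg~hp and mpg~hp+wt alike.

```r
library(caret)  # method = "svmLinear" also requires kernlab to be installed

# Hypothetical helper: recover a and b_j in y = a + sum(b_j * x_j)
# by predicting at the origin and at one unit step along each predictor.
linear_equation <- function(fit, predictors) {
  p <- length(predictors)
  # Row 1 is all zeros; row j + 1 has a 1 in column j and 0 elsewhere
  probe <- as.data.frame(rbind(0, diag(p)))
  names(probe) <- predictors
  pred <- as.numeric(predict(fit, newdata = probe))
  a <- pred[1]        # prediction at the origin = intercept
  b <- pred[-1] - a   # change per unit step = coefficient
  names(b) <- predictors
  c("(Intercept)" = a, b)
}

# Univariate case: mpg = a + b * hp
set.seed(1)
fit_hp <- train(mpg ~ hp, data = mtcars, method = "svmLinear")
eq_hp <- linear_equation(fit_hp, "hp")
print(eq_hp)

# Two predictors: mpg = a + b1 * hp + b2 * wt
set.seed(1)
fit_hp_wt <- train(mpg ~ hp + wt, data = mtcars, method = "svmLinear")
eq_hp_wt <- linear_equation(fit_hp_wt, c("hp", "wt"))
print(eq_hp_wt)
```

A quick sanity check is to compare eq_hp[1] + eq_hp[2] * mtcars$hp with predict(fit_hp, mtcars); the two should agree up to floating-point error. In the univariate case, coef(lm(predict(fit_hp, mtcars) ~ mtcars$hp)) should give the same two numbers.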
