Data format for regression with package e1071

Question

I'm trying to understand how I can get my data into a format that allows me to do svm regression. I have a time series that looks like

data

[1] 1.20962 1.21036 1.21006 1.20873 1.20658 1.20676 1.20576 1.20555 1.20526 1.20504 1.20516
[12] 1.20581 1.20456 1.20558 1.20496 1.20547 1.20382 1.20312 1.20259 1.20306 1.20137 1.20089

I do a rev and then diff on it

data <- rev(data)
data <- diff(data)
data

[1] -0.00040  0.00092 -0.00095 -0.00045  0.00013  0.00247  0.00055 -0.00058  0.00106  0.00188
[11]  0.00110 -0.00002  0.00069  0.00019 -0.00058  0.00080 -0.00021 -0.00079 -0.00007  0.00123

But it's not in the right format to use with svm

library(e1071)
svm.model <- svm(data=data, type="nu-regression", kernel="radial" )

Error in inherits(x, "Matrix") : argument "x" is missing, with no default

I'm not sure how to get it into a data.frame or whatever other structure svm is looking for.

EDIT: I was looking for something like this

data <- matrix(unlist(data), ncol = 2, byrow = TRUE)

svm.model <- svm(x=data[,1],y=data[,2],data=data, type="nu-regression", kernel="radial" )

The svm command needs to know what the dependent and independent variables are, and it looks like you only have one variable. See, e.g., rischanlab.github.io/SVM.html that shows svm on Species ~ .. — mysteRious
I think I need to convert the list into a two column matrix? — kernel-trick
Do you just have a univariate time series you want to forecast? If so, this may be helpful: computationalfinance.lsi.upc.edu/?page_id=242 -- I may not be understanding your intention to do regression. Feels like some part of the data or the question is missing. — mysteRious

dinman dinman · Accepted Answer · 2018-03-21T20:23:00

As the comments point out, svm needs both a dependent and an independent variable, and you only have one column of data. If you want to use svm to forecast the time series, you need a time column or index to act as the predictor. Below is an example:

  library(e1071)

  # Column 1 holds the observed series, column 2 the time index.
  foo.data <- matrix(nrow = 22, ncol = 2)
  foo.data[, 1] <- c(1.20962, 1.21036, 1.21006, 1.20873, 1.20658, 1.20676, 1.20576, 1.20555,
                     1.20526, 1.20504, 1.20516, 1.20581, 1.20456, 1.20558, 1.20496, 1.20547,
                     1.20382, 1.20312, 1.20259, 1.20306, 1.20137, 1.20089)

  # Your data has no time/date index, so use the observation number.
  foo.data[, 2] <- seq(from = 1, to = 22, by = 1)

  # Regress the series (column 1) on the time index (column 2).
  foo.svm <- svm(foo.data[, 1] ~ foo.data[, 2], type = "nu-regression", kernel = "radial")

  summary(foo.svm)
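If you also want predictions, it may be easier to keep the data in a data.frame with named columns, because predict() matches the names in newdata against the names in the formula. The sketch below is one possible way to do that, not the only one. The names price, idx, foo.df, lag1, and lag.df are made up for illustration, and the indices 23 to 25 are hypothetical future time points. The second half assumes you want to regress each differenced value on the one before it, which is one reading of the two-column layout in the question's edit; it takes the differences in the order the values are listed, without the rev step.

```r
library(e1071)

# The 22 observations from the question, in the order they were printed.
price <- c(1.20962, 1.21036, 1.21006, 1.20873, 1.20658, 1.20676, 1.20576,
           1.20555, 1.20526, 1.20504, 1.20516, 1.20581, 1.20456, 1.20558,
           1.20496, 1.20547, 1.20382, 1.20312, 1.20259, 1.20306, 1.20137,
           1.20089)

# --- Option 1: regress the series on an explicit time index ---------------

# One row per observation: the time index (predictor) and the value (response).
foo.df <- data.frame(idx = seq_along(price), price = price)

# Named columns in the formula let predict() find them in newdata later.
foo.svm <- svm(price ~ idx, data = foo.df,
               type = "nu-regression", kernel = "radial")

# Fitted values at the observed indices.
fitted.price <- predict(foo.svm, newdata = foo.df)

# Predictions at three later indices (hypothetical future points).
future.price <- predict(foo.svm, newdata = data.frame(idx = 23:25))

# --- Option 2: regress each difference on the previous difference ---------

d <- diff(price)            # 21 first differences
lagged <- embed(d, 2)       # column 1 is d[t], column 2 is d[t - 1]
lag.df <- data.frame(y = lagged[, 1], lag1 = lagged[, 2])

lag.svm <- svm(y ~ lag1, data = lag.df,
               type = "nu-regression", kernel = "radial")

# One-step-ahead prediction from the most recent difference.
next.diff <- predict(lag.svm, newdata = data.frame(lag1 = tail(d, 1)))
```

Keep in mind that a radial kernel tends to flatten out away from the training range, so predictions at indices well beyond 22 in option 1 should be treated with caution.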
