I'm wondering what the state-of-the-art efficient (approximate) implementations of Support Vector Machines (SVMs) are for large/very large datasets (5-15M+ rows) with a nonlinear decision boundary (such as a Gaussian kernel).
I'm aware of two particular lines of work. On the one hand, there is this survey covering stochastic gradient descent and related methods: http://cseweb.ucsd.edu/~akmenon/ResearchExam.pdf (a rough sketch of what I understand this style of approach to look like in practice is included below).
On the other hand, there are the core vector machine / ball vector machine approaches: http://www.c2i.ntu.edu.sg/ivor/cvm.html, where two papers describing both core and ball vector machines can be found.
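For concreteness, here is a minimal sketch of the kind of SGD-style approximation I have in mind, assuming scikit-learn: random Fourier features (RBFSampler) approximating the Gaussian kernel, feeding a hinge-loss SGDClassifier. The gamma, n_components, and alpha values are placeholders, not tuned.

```python
# Sketch: approximate the Gaussian kernel with an explicit random-feature map,
# then train a linear SVM with SGD, keeping training roughly linear in n.
from sklearn.kernel_approximation import RBFSampler
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

approx_rbf = RBFSampler(gamma=0.1, n_components=500, random_state=0)   # placeholder params
linear_svm = SGDClassifier(loss="hinge", alpha=1e-5, max_iter=20, random_state=0)

model = make_pipeline(approx_rbf, linear_svm)
# model.fit(X_train, y_train)   # X_train: (n_samples, n_features)
# model.score(X_test, y_test)
```

The idea, as I understand it, is that the random-feature map gives an explicit low-dimensional space whose inner product approximates the Gaussian kernel, so the SGD-trained linear SVM behaves approximately like a kernel SVM at roughly linear cost.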
In other words, I believe an SVM is quite a plausible model for the problem at hand, but the sample size rules out the standard SVM implementation (training can be up to O(n^3)). I'm looking for an "approximate" implementation that is reasonably accurate while running below O(n^2) in time complexity. What are the fastest such implementations? Do they work well empirically, i.e., come close to the accuracy of the original SVM?
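To clarify what I mean by "reasonably accurate", this is roughly how I was planning to measure the accuracy gap myself: train an exact RBF-kernel SVC and an approximate pipeline (like the one above) on a subsample small enough for the exact solver, and compare held-out accuracy. Again assuming scikit-learn; n_sub and the kernel parameters are arbitrary placeholders.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def compare_on_subsample(approx_model, X, y, n_sub=20_000, seed=0):
    """Held-out accuracy of an exact RBF SVC vs. an approximate model on a subsample."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=min(n_sub, len(X)), replace=False)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X[idx], y[idx], test_size=0.25, random_state=seed
    )
    exact = SVC(kernel="rbf", gamma=0.1, C=1.0).fit(X_tr, y_tr)   # O(n^2)-O(n^3) training
    approx = approx_model.fit(X_tr, y_tr)                          # roughly linear in n
    return exact.score(X_te, y_te), approx.score(X_te, y_te)
```

But this only tells me about behavior on a subsample, hence the question about how these approximations hold up empirically at the 5-15M+ row scale.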