3
votes

I want to classify using libsvm. I have 9 training sets, each with 144,000 labelled instances, and each instance has a variable number of features. Training one set takes about 12 hours (./svm-train with probability estimates). As I don't have much time, I would like to train more than one set at a time, but I'm not sure whether that is possible. Can I run all 9 processes simultaneously in different terminals?

./svm-train -b 1 feat1.txt
./svm-train -b 1 feat2.txt
      .
      .
      .
./svm-train -b 1 feat9.txt

(I'm using Fedora Core 5.)


3 Answers

7
votes

You can have libsvm use OpenMP for parallelization. See this libsvm FAQ entry: http://www.csie.ntu.edu.tw/~cjlin/libsvm/faq.html#f432

3
votes

As Adam said, it depends on how many cores and processors your system has available. If that's insufficient, why not spin up a few EC2 instances to run on?

The Infochimps MachetEC2 public AMI comes with most of the tools you'll need: http://blog.infochimps.org/2009/02/06/start-hacking-machetec2-released/

2
votes

Yes. But unless you have a multi-core or multi-processor system, it may not save you much time.
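Rather than opening nine terminals, you could start the jobs in the background from one shell and wait for them all. Below is a minimal sketch, not a tested recipe for your setup: `train_all` is a hypothetical helper name, the `.log` files are my own addition, and the command string is just the one from the question.

```shell
#!/bin/sh
# Sketch: run one training process per file in parallel, then wait for all.
# train_all is a hypothetical helper, not part of libsvm.
#   $1       command to run, e.g. "./svm-train -b 1"
#   $2...    training files, one background job each
train_all() {
    cmd=$1
    shift
    pids=""
    for f in "$@"; do
        # $cmd is left unquoted on purpose so "./svm-train -b 1" splits into words.
        # Output goes to a per-file log (assumed naming: <file>.log).
        $cmd "$f" > "$f.log" 2>&1 &
        pids="$pids $!"
    done
    failed=0
    for pid in $pids; do
        # wait returns the exit status of that job
        wait "$pid" || failed=1
    done
    return "$failed"
}
```

You would call it as `train_all "./svm-train -b 1" feat[1-9].txt`. To see how many processors are online first, `getconf _NPROCESSORS_ONLN` should work on most Linux systems. Starting more jobs than you have cores mostly makes them take turns. Each svm-train process also allocates its own kernel cache (the `-m` option, 100 MB by default), so check that all nine fit in memory.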