I am using Eigen's LSCG to iteratively solve an over-determined system which I have expressed using sparse matrices and I believe is too slow. By iteratively I mean something like:
//The following is pseudo code
main() {
//compute A
A=..
//compute B
B=..
while(someCondition)
{
x=solve(A,B)
//based on x alter A and B
A=..
B=..
}
}
For example when A has 36k rows and 18k cols with 145k nnz elements and B has 36k rows 3 cols and 110k nnz elements (notice that B is in fact dense) it takes my desktop 74s to execute x=solve(A,B).
- Are these times normal? I guess the answer depends on the machine the code is being ran on. I have an AMD FX 6300 so I suppose the hardware is not the main problem.
- If it is indeed slow what might be the problem? Does the fact that B matrix is not in reality dense slow the solving step down?
To test the time on your machine I have written some simple test code:
#include <Eigen/Sparse> //system solving and Eigen::SparseMatrix
#include <ctime> //measure time to execute
#include <unsupported/Eigen/SparseExtra> //loadMarket
using SpMatrix = Eigen::SparseMatrix<double>;
using Matrix = Eigen::MatrixXd;
int main() {
//load A and B
SpMatrix A, B;
Eigen::loadMarket(A, "/AMatrixDirectory/A.mtx");
Eigen::loadMarket(B, "/BMatrixDirectory/B.mtx");
std::clock_t start;
start = std::clock();
Eigen::LeastSquaresConjugateGradient<SpMatrix> solver;
solver.compute(A);
Matrix x = solver.solve(B);
std::cout << "Time: " << (std::clock() - start) / (double)(CLOCKS_PER_SEC)
<< " s" << std::endl;
return 0;
}
The above is one iteration of the while loop in the pseudo code above. To run the above code you will need:
- Eigen
- C++11 (if not replace using with typedefs)
- The matrices A and B which I have uploaded here
Edit
ggael proposed to use SimplicialLDLT claiming that it has a better performance compared to LSCG in my problem.
For the purpose of comparing Eigen's solvers performance to a particular problem Eigen offers the BenchmarkRoutine . Unfortunately I was not able to use it since only square matrices can be used with it.
I edited the code comparing LSCG and SimplicialLDLT (please do not get discouraged by the length of the code since it consists of 4 blocks similar to each other with only some minor changes):
#include <Eigen/Sparse> //system solving and Eigen::SparseMatrix
#include <ctime> //measure time to execute
#include <unsupported/Eigen/SparseExtra> //loadMarket
// Use typedefs instead of using if c++11 is not supported by your compiler
using SpMatrix = Eigen::SparseMatrix<double>;
using Matrix = Eigen::MatrixXd;
int main() {
// Use multi-threading. If you don't have OpenMP on your system
// it will simply use 1 thread and it will not crash. So do not worry about
// that.
Eigen::initParallel();
Eigen::setNbThreads(6);
// Load system matrices
SpMatrix A, B;
Eigen::loadMarket(A, "/home/iason/Desktop/Thesis/build/A.mtx");
Eigen::loadMarket(B, "/home/iason/Desktop/Thesis/build/B.mtx");
// Initialize the clock which will measure the solvers
std::clock_t start;
{
// Reset clock
start = std::clock();
// Solve system using LSCG
Eigen::LeastSquaresConjugateGradient<SpMatrix> LSCG_solver;
LSCG_solver.setTolerance(pow(10, -10));
LSCG_solver.compute(A);
// Use auto specifier to hold the return value of solve. Eigen::Solve object
// is being returned
auto LSCG_solution = LSCG_solver.solve(B);
std::cout << "LSCG Time Using auto: "
<< (std::clock() - start) / (double)(CLOCKS_PER_SEC) << " s"
<< std::endl;
}
{
// Reset clock
start = std::clock();
// Solve system using LSCG
Eigen::LeastSquaresConjugateGradient<SpMatrix> LSCG_solver;
LSCG_solver.setTolerance(pow(10, -10));
LSCG_solver.compute(A);
// Use Matrix specifier instead of auto. Implicit conversion taking place(?)
Matrix LSCG_solution = LSCG_solver.solve(B);
std::cout << "LSCG Time Using Matrix: "
<< (std::clock() - start) / (double)(CLOCKS_PER_SEC) << " s"
<< std::endl;
}
{
// Reset clock
start = std::clock();
// Solve system using SimplicialLDLT
Eigen::SimplicialLDLT<SpMatrix> SLDLT_solver;
SLDLT_solver.compute(A.transpose() * A);
auto SLDLT_solution = SLDLT_solver.solve(A.transpose() * B);
std::cout << "SimplicialLDLT Time Using auto: "
<< (std::clock() - start) / (double)(CLOCKS_PER_SEC) << " s"
<< std::endl;
}
{
// Reset clock
start = std::clock();
// Solve system using SimplicialLDLT
Eigen::SimplicialLDLT<SpMatrix> SLDLT_solver;
SLDLT_solver.compute(A.transpose() * A);
// Use Matrix specifier instead of auto. Implicit conversion taking place(?)
Matrix SLDLT_solution = SLDLT_solver.solve(A.transpose() * B);
std::cout << "SimplicialLDLT Time Using Matrix: "
<< (std::clock() - start) / (double)(CLOCKS_PER_SEC) << " s"
<< std::endl;
}
return 0;
The above code produces the following result:
LSCG Time Using auto: 0.000415 s
LSCG Time Using Matrix: 52.7668 s
SimplicialLDLT Time Using auto: 0.216593 s
SimplicialLDLT Time Using Matrix: 0.239976 s
As the results proove when I use Eigen::MatrixXd instead of auto I get the situation ggael described in one of his comments, namely LSCG not achieving a solution without setting a higher tolerance and SimplicialLDLT being much faster.
- Are the solvers actually solving the system when I use auto?
- Is the Solver object being implicitly converted to a Matrix Object when I use the Matrix specifier? Since When using LSCG the only change in the first two cases is the use of auto and Matrix respectively,does this implicit conversion to Matrix take 52.7668-0.000415 seconds?
- Is there a faster way to extract the solution Matrix from the Solve object?
solver.setTolerance(1e-14);prior to solving fixes the issue. Here the computation time reduces to 6s. - ggaelauto, you get an abstract expression of the solution which has not been computed yet. See this. - ggael