Is it possible (using Boost::Spirit::QI) to parse numbers from a comma separated string so that I get the index of each parsed number?

Suppose I have a string "23,123,65,1" and I want to insert each of these numbers into a matrix at given locations (0, 1, 2, 3). One way to do it would be to parse the numbers into a std::vector and then copy them to the matrix row, but that isn't particularly fast.

Currently I'm using the vector variant:

Matrix data(10, 4);
int row = 0;
int col = 0;
std::string str = "23,123,65,1";
std::vector<double> res;
if (qi::parse(str.begin(), str.end(), qi::double_ % ',', res))
{
  std::for_each(res.begin(), res.end(), [&col, &data, &row](double elem) {
      data(row, col) = elem;
      col++;
  });
}

It'd be awesome if the parser had a success callback that takes a lambda function or a similar feature.

1 Answer

There are a number of approaches.

  • What I'd usually recommend instead is using well-thought-out repeat(n) expressions with directly exposed container attributes (like vector<vector<double> >).

  • What you seem to be looking for is semantic actions with state. (This is common practice for people coming from lex/yacc.)

These two approaches are covered in three full demos below (1, 2 and 3).

  • An advanced technique is using customization points to allow Spirit to directly treat your Matrix type as a container attribute, overriding the insertion logic for it using spirit::traits. For this approach, see this answer: pass attribute to child rule in boost spirit.

Using inherited attributes

Here is a relatively straightforward approach:

  1. parsing directly into a vector<vector<double> > (full code live online)

    qi::rule<It, Matrix::value_type(size_t cols), qi::blank_type> row;
    qi::rule<It, Matrix(size_t rows,size_t cols), qi::blank_type> matrix;
    
    row    %= skip(char_(" \t,")) [ repeat(_r1) [ double_ ] ];
    matrix %= eps // [ std::cout << phx::val("debug: ") << _r1 << ", " << _r2 << "\n" ]
           >> repeat(_r1) [ row(_r2) >> (eol|eoi) ];
    

    Usage:

    if (qi::phrase_parse(f,l,parser(10, 4),qi::blank, m))
        std::cout << "Wokay\n";
    else
        std::cerr << "Uhoh\n";
    
  2. Similarly, but adapting a Matrix struct (full code live here)

    struct Matrix
    {
        Matrix(size_t rows, size_t cols) : _cells(), _rows(rows), _cols(cols) { }
    
        double       & data(size_t col, size_t row)       { return _cells.at(row).at(col); } 
        const double & data(size_t col, size_t row) const { return _cells.at(row).at(col); } 
    
        size_t columns() const { return _cols; }
        size_t rows()    const { return _rows; }
    
        std::vector<std::vector<double> > _cells;
        size_t _rows, _cols;
    };
    
    BOOST_FUSION_ADAPT_STRUCT(Matrix, (std::vector<std::vector<double> >,_cells))
    

    Usage:

    Matrix m(10, 4);
    
    if (qi::phrase_parse(f,l,parser(m.rows(),m.columns()),qi::blank, m))
        std::cout << "Wokay\n";
    else
        std::cerr << "Uhoh\n";
    

Using semantic actions/qi::locals

3. This is more work, but potentially more flexible. You'd define a polymorphic callable type to insert a value at a given cell:

struct MatrixInsert
{
    // nested result metafunction: tells Phoenix that the call returns bool
    template <typename, typename, typename, typename> struct result { typedef bool type; };
    template <typename Matrix, typename Row, typename Col, typename Value>
        bool operator()(Matrix &m, Row& r, Col& c, Value v) const
        {
            if (r < m.rows() && c < m.columns())
            {
                m.data(r, c++) = v; // store, then advance the column (c is taken by reference)
                return true; // parse continues
            }
            return false;    // fail the parse
        }
};

BOOST_PHOENIX_ADAPT_CALLABLE(matrix_insert, MatrixInsert, 4)

The last line makes this a Phoenix lazy function, so you can use it in your semantic actions without awkward phoenix::bind syntax:

qi::rule<It, Matrix(), qi::blank_type, qi::locals<size_t /*_a: row*/, size_t/*_b: col*/> > matrix;
matrix = eps    [ _a = 0 /*current row*/ ]
     >> (
            eps     [ _b = 0 /*current col*/ ] 
         >> double_ [ _pass = matrix_insert(_val, _a, _b, _1) ] % ','
        ) % (eol    [ ++_a /*next row*/])
     ;

The full code is, again, live on liveworkspace.org.