I'm parsing input into a map of vectors with Boost.Spirit.

It works well, but for one particular case of user input I need to supply a fixed string internally, so that the internal string is parsed in lieu of the user input.

Here's a sample of the grammar:

input_pair = 
    key( std::string("__input")) >> valid_input % ','
    ;

where:

key = 
    qi::attr( _r1 )
    ;

and "valid_input" is a simple rule that tests for specific strings / character patterns, etc.

This grammar pairs the input with a pre-defined internal key and stores it in a map as a vector. If the input contains the specified delimiter, input is appropriately parsed into separate elements of the receiving vector.

The problem I've encountered, however, is providing a pre-defined string for "valid_input".

My first inclination is to do what I did with the map key:

input_pair = 
    key( std::string("__input")) >> key ( std::string("A, B, C")) % ','
    ;

But, of course, the entire string is inserted as the first element of the vector. That is, the comma delimiters in "A, B, C" are not recognized as I'd hoped they would be.

Thus my question:

Given a Boost.Spirit parser whose grammar parses input into a map of vectors, is there any way to parse a fixed string defined within the parser itself into a vector?

1 Answer

Given input "1 2 3 4", the following rule:

typedef std::vector<std::string> attr_t;

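rule = *qi::lexeme[+qi::graph];

will (obviously) parse an attribute value of std::vector<string> { "1", "2", "3", "4" }. (Note that qi::char_ inside the lexeme would match the spaces as well, giving a single element "1 2 3 4", which is why qi::graph is used here.)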

Now change the rule to

rule %= qi::attr(attr_t { "aap", "noot", "mies" }) >> *qi::lexeme[+qi::graph];

and you'll receive std::vector<string> { "aap", "noot", "mies", "1", "2", "3", "4" }. Use qi::omit if you want the hardcoded vector only.

This is C++11 uniform initialization syntax at work, so if you can't use that, you'd have to supply a vector some other way:

static const std::string hardcoded[] = { "aap", "noot", "mies" };
static const attr_t const_data(hardcoded, hardcoded+3);
rule = qi::attr(const_data) >> *qi::lexeme[+qi::graph];

This is also slightly more efficient (at the cost of more verbose code).


Here is a fully working example with some variations of this (assuming a C++11 compiler for the test main):

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/karma.hpp>
#include <iostream>
#include <string>
#include <vector>

namespace qi    = boost::spirit::qi;
namespace karma = boost::spirit::karma;

typedef std::vector<std::string> attr_t;

int main()
{
    const std::string input = "1 2 3 4";
    auto f(std::begin(input)), l(std::end(input));

    qi::rule<decltype(f), attr_t(), qi::space_type> rule;

    // Each assignment below replaces the previous one, so only the last
    // variant is active; comment out the ones you don't want.

    // #1: no hardcoded values
    rule = *qi::lexeme[+qi::graph];

    // #2: hardcoded inline temp vector
    rule = qi::attr(attr_t { "aap", "noot", "mies" }) >> *qi::lexeme[+qi::graph];

    // #3: hardcoded static vector, C++03 style
    static const std::string hardcoded[] = { "aap", "noot", "mies" };
    static const attr_t const_data(hardcoded, hardcoded+3);
    rule = qi::attr(const_data) >> *qi::lexeme[+qi::graph];

    try
    {
        attr_t data;
        bool ok = qi::phrase_parse(f, l, rule, qi::space, data);
        if (ok)
        {
            std::cout << "parse success\n";
            std::cout << "data: " << karma::format_delimited(karma::auto_, ' ', data) << "\n";
        }
        else std::cerr << "parse failed: '" << std::string(f,l) << "'\n";

        // anything left between f and l was not consumed by the rule
        if (f!=l) std::cerr << "trailing unparsed: '" << std::string(f,l) << "'\n";

        return ok? 0 : 255;
    } catch(const qi::expectation_failure<decltype(f)>& e)
    {
        std::string frag(e.first, e.last);
        std::cerr << e.what() << "'" << frag << "'\n";
        return 255;
    }

}

With only variant #1 active (each later assignment to rule replaces the earlier ones), it prints

parse success
data: 1 2 3 4
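To tie this back to the "A, B, C" string from the question: one option is to split the fixed string into a vector up front and hand that vector to qi::attr. The helper below is a minimal sketch using only the standard library; split_fixed is a hypothetical name, not part of Boost.Spirit, and trimming spaces and tabs around each piece is an assumption about what the question wants.

```cpp
#include <string>
#include <vector>

typedef std::vector<std::string> attr_t;

// Hypothetical helper (not a Boost.Spirit function): splits `text` on
// `delim` and trims spaces and tabs around each piece.
attr_t split_fixed(const std::string& text, char delim)
{
    attr_t out;
    std::string::size_type start = 0;
    for (;;)
    {
        // locate the next delimiter (npos means "last piece")
        const std::string::size_type end = text.find(delim, start);
        const std::string piece = (end == std::string::npos)
            ? text.substr(start)
            : text.substr(start, end - start);

        // trim surrounding blanks; an all-blank piece becomes empty
        const std::string::size_type first = piece.find_first_not_of(" \t");
        const std::string::size_type last  = piece.find_last_not_of(" \t");
        out.push_back(first == std::string::npos
            ? std::string()
            : piece.substr(first, last - first + 1));

        if (end == std::string::npos) break;
        start = end + 1;
    }
    return out;
}
```

The result could then be used in place of the literal vector, for example qi::attr(split_fixed("A, B, C", ',')), so that each of A, B and C becomes its own element.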