I am trying to parse for loops having this type of syntax:

for(loop = 1:10) {


}

In my grammar I have the rules:

genericString %= lexeme[+(char_("a-zA-Z"))];
intRule %= int_;
commandString %= lexeme[+(char_ - '}')];
forLoop %= string("for")
        >> '('
        >> genericString // variable e.g. c
        >> '='
        >> (intRule | genericString) // variable e.g. i
        >> ':'
        >> (intRule | genericString) // variable e.g. j
        >> ')' >> '{'
        >> (forLoop | commandString)
        >> '}';

While this works for the simple example above, it fails to parse the following nested example:

for(loop = 1:10) {
    for(inner = 1:10) {

    }
}

I am guessing it's due to the parser 'getting confused' with brace placement. I think I need to do something like that presented at http://boost-spirit.com/distrib/spirit_1_7_0/libs/spirit/example/fundamental/lazy_parser.cpp (alas, I found it difficult to follow).

Cheers,

Ben.

EDIT 1:

I am now thinking it would be better to handle the recursion from the commandString (called nestedBlock below) rather than within the forLoop, i.e., something like:

forLoop %= string("for")
        >> '('
        >> genericString // variable e.g. c
        >> '='
        >> (intRule | genericString) // variable e.g. i
        >> ':'
        >> (intRule | genericString) // variable e.g. j
        >> ')' 
        >> nestedBlock;

nestedBlock %= lexeme['{' >> -(char_ - '}' - '{')
                          >> -nestedBlock
                          >> -(char_ - '}' - '{')
                          >> '}'];

which is failing with massive boost::spirit errors. The rules are defined as:

    qi::rule<Iterator, std::string(), ascii::space_type> nestedBlock;
    qi::rule<Iterator, Function(), ascii::space_type> forLoop;

Function is a struct of boost::variants.
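The recursion that EDIT 1 is aiming for can be sketched without Spirit. What follows is a hand-written recognizer in plain C++, not the grammar above and not Spirit's API; match_block is a hypothetical name chosen for illustration. It loops over the body (zero or more characters or nested blocks), whereas -(char_ - '}' - '{') matches at most one character and -nestedBlock at most one nested block.

```cpp
#include <cstddef>
#include <string>

// Recognize  '{' (text | nested block)* '}'  starting at s[pos].
// On success pos is moved just past the closing brace.
// On failure pos is left unchanged.
bool match_block(const std::string& s, std::size_t& pos) {
    if (pos >= s.size() || s[pos] != '{') return false;
    std::size_t p = pos + 1;
    while (p < s.size() && s[p] != '}') {
        if (s[p] == '{') {
            // A nested block: recurse, which advances p past its '}'.
            if (!match_block(s, p)) return false;
        } else {
            ++p;  // ordinary body character
        }
    }
    if (p >= s.size()) return false;  // input ended before the closing '}'
    pos = p + 1;
    return true;
}
```

Calling match_block on "{ a { b } c { } }" with pos = 0 consumes the whole string, while an unclosed input such as "{ a { b }" is rejected and pos stays where it was.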

EDIT 2:

So this is what I now have (which is designed to work with or without nested structures):

commandCollection %= *start;

forLoop %= string("for")
        >> '('
        >> genericString // variable e.g. c
        >> '='
        >> (intRule | genericString) // variable e.g. i
        >> ':'
        >> (intRule | genericString) // variable e.g. j
        >> ')'
        >> '{'
        >>       commandCollection
        >> '}';

start %= loadParams  | restoreGenomeData | openGenomeData | initNeat | initEvo |
                 initAllPositions | initAllAgents | initCoreSimulationPointers |
                 resetSimulationKernel | writeStats | restoreSimState |
                 run | simulate | removeObjects | setGeneration |
                 setParam | getParam | pause | create | reset |
                 loadAgents | getAgent | setAgent | listParams | loadScript | forLoop
                 | wait | commentFunc | var | add | sub | mult | div | query;

And I declare the commandCollection rule as follows:

qi::rule<Iterator, boost::fusion::vector<Function>, ascii::space_type> commandCollection;

I assumed that this would do as I expect. The commandCollection is defined as 0 or more commands which should be stored in a boost::fusion::vector. However, when I come to extract the vector from the Function() struct (bearing in mind the start rule uses a Function() iterator), the type for some reason is not identified as a boost::fusion::vector so cannot be extracted. I'm not sure why...

However, if I were to just have

commandCollection %= start;

and declare the rule as

qi::rule<Iterator, Function(), ascii::space_type> commandCollection;

and then try to extract the data as a single Function() struct, it works fine. But I would like it to store multiple commands (i.e. *start) in some kind of container. I also tried with a std::vector but this also failed.
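For reference, boost::fusion::vector<Function> is a fixed-size tuple holding exactly one Function, while a Kleene star such as *start produces a growable container; rule attributes are also written as a signature with trailing parentheses, e.g. std::vector<Function>(). The sketch below is not Spirit code. It is a plain C++ illustration of the "zero or more commands collected into a std::vector" shape, with Command, skip_ws and parse_commands as hypothetical names and a simplified input format (whitespace-separated words and brace-delimited blocks) assumed for brevity.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the question's Function struct.
struct Command {
    std::string name;               // a command word, or "block" for a nested body
    std::vector<Command> children;  // zero or more nested commands
};

static void skip_ws(const std::string& s, std::size_t& pos) {
    while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos]))) ++pos;
}

// Parse zero or more commands until '}' or end of input (the role of *start).
// Returns false only if a nested block is never closed.
bool parse_commands(const std::string& s, std::size_t& pos, std::vector<Command>& out) {
    for (;;) {
        skip_ws(s, pos);
        if (pos >= s.size() || s[pos] == '}') return true;  // zero matches is fine
        Command cmd;
        if (s[pos] == '{') {
            ++pos;
            cmd.name = "block";
            if (!parse_commands(s, pos, cmd.children)) return false;
            if (pos >= s.size() || s[pos] != '}') return false;
            ++pos;  // consume the closing brace
        } else {
            while (pos < s.size() &&
                   !std::isspace(static_cast<unsigned char>(s[pos])) &&
                   s[pos] != '{' && s[pos] != '}') {
                cmd.name += s[pos++];
            }
        }
        out.push_back(cmd);
    }
}
```

Because the container grows with each match, an empty body simply yields an empty vector, which is the behaviour one would want from a commandCollection inside a for loop.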

And what is the question? - Arne Mertz
To me the question is clear. - sehe
@sehe -- I only clarified the question in the title after Arne's remark :-) - Ben J
@BenJ The whitespace is preskipped, even before you start the commandString rule. - sehe
@BenJ To your "Edit 2": you should make that another question, I feel. Could you adapt the SSCCE to exhibit the problem? (I feel you're simply confusing fusion::vector<Function>() with std::vector<Function()>) - sehe

1 Answer

Your commandString rule doesn't accept the empty body of the inner loop, because + requires at least one character.

Fix it by changing + to * here:

commandString %= lexeme[*(char_ - '}')];

Or, if you prefer to match an optional block, instead of a potentially empty block, consider the fix mentioned by @llonesmiz.
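To make the difference between + and * concrete, here is a small hand-rolled sketch in plain C++. It is not Spirit code; skip_space and match_body are hypothetical helpers that only imitate what lexeme[+(char_ - '}')] and lexeme[*(char_ - '}')] do after the skipper has pre-skipped whitespace.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Advance past whitespace, the way a space skipper would before a lexeme starts.
std::size_t skip_space(const std::string& s, std::size_t pos) {
    while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos]))) ++pos;
    return pos;
}

// Stand-in for the commandString rule.
// min_chars = 1 models '+' (one or more), min_chars = 0 models '*' (zero or more).
// Returns true and advances pos on a match; leaves pos untouched otherwise.
bool match_body(const std::string& s, std::size_t& pos, std::size_t min_chars) {
    std::size_t p = skip_space(s, pos);  // pre-skip happens before the lexeme
    const std::size_t start = p;
    while (p < s.size() && s[p] != '}') ++p;  // consume everything up to '}'
    if (p - start < min_chars) return false;
    pos = p;
    return true;
}
```

With the remaining input "\n   }\n}" (the inner loop's empty body), the pre-skip lands directly on the closing brace, so the min_chars = 1 variant has nothing to consume and fails, while the min_chars = 0 variant succeeds and leaves the brace for the enclosing rule.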

Test case:

#define BOOST_SPIRIT_DEBUG
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/karma.hpp>
// #include <boost/spirit/include/phoenix.hpp>
#include <iostream> // std::cout, std::cerr
#include <string>

namespace qi    = boost::spirit::qi;
namespace karma = boost::spirit::karma;
namespace phx   = boost::phoenix;

typedef boost::variant<int, std::string> Value;
typedef std::pair<Value, Value> Range;
typedef std::pair<std::string, Range> Iteration;

typedef Iteration attr_t;

template <typename It, typename Skipper = qi::space_type>
    struct parser : qi::grammar<It, attr_t(), Skipper>
{
    parser() : parser::base_type(start)
    {
        using namespace qi;

        genericString %= lexeme[+(char_("a-zA-Z"))];// variable e.g. c
        intRule %= int_;
        commandString %= lexeme[*(char_ - '}')];
        value = intRule | genericString;
        range = value >> ':' >> value;
        forLoop %= lit("for")
                >> '(' >> genericString >> '=' >> range >> ')' 
                >> '{'
                >>      (forLoop | commandString)
                >> '}';

        start = forLoop;

        BOOST_SPIRIT_DEBUG_NODES(
                (start)(intRule)(genericString)(commandString)(forLoop)(value)(range)
                 );
    }

  private:
    qi::rule<It, std::string(), Skipper> genericString, commandString;
    qi::rule<It, int(), Skipper> intRule;
    qi::rule<It, Value(), Skipper> value;
    qi::rule<It, Range(), Skipper> range;
    qi::rule<It, attr_t(), Skipper> forLoop, start;
};

bool doParse(const std::string& input)
{
    typedef std::string::const_iterator It;
    auto f(begin(input)), l(end(input));

    parser<It, qi::space_type> p;
    attr_t data;

    try
    {
        bool ok = qi::phrase_parse(f,l,p,qi::space,data);
        if (ok)   
        {
            std::cout << "parse success\n";
        }
        else      std::cerr << "parse failed: '" << std::string(f,l) << "'\n";

        if (f!=l) std::cerr << "trailing unparsed: '" << std::string(f,l) << "'\n";
        return ok;
    } catch(const qi::expectation_failure<It>& e)
    {
        std::string frag(e.first, e.last);
        std::cerr << e.what() << "'" << frag << "'\n";
    }

    return false;
}

int main()
{
    bool ok = doParse(
            "for(loop = 1:10) {\n"
            "   for(inner = 1:10) {\n"
            "   }\n"
            "}"
            );
    return ok? 0 : 255;
}

I heartily recommend looking at the BOOST_SPIRIT_DEBUG output, which shows how parsing failed with the original rule:

<forLoop>
  <try>\n   }\n}</try>
  <fail/>
</forLoop>
<commandString>
  <try>\n   }\n}</try>
  <fail/>
</commandString>
<fail/>