2
votes

I have a working boost spirit parser and was thinking if it is possible to do iterative update of an abstract syntax tree with boost spirit?

I have a struct similar to:

  struct ast;

  typedef boost::variant< boost::recursive_wrapper<ast> > node;

  struct ast
  {
    std::vector<int> value;
    std::vector<node> children;
  };

Which is being parsed by use of:

bool r = phrase_parse(begin, end, grammar, space, ast);

Would it be possible to do iterative update of abstract syntax tree with boost spirit? I have not found any documentation on this, but I was thinking if the parsers semantic actions could push_back on an already existing AST. Has anyone tried this?

This would allow for parsing like this:

bool r = phrase_parse(begin, end, grammar, space, ast); //initial parsing

//the second parse will be called at a later state given some event/timer/io/something

bool r = phrase_parse(begin, end, grammar, space, ast); //additional parsing which will update the already existing AST
1
Is this is possible? (Yes). Has anyone tried this? (Yes). Have I done this? (Yes). Is the problem underspecified? (Yes)! Have I answered it? Hell yes. Hope you likesehe

1 Answers

0
votes

How would you know which nodes to merge? Or would you always add ("graft") at the root level? In that case, why don't you just parse another and merge moving the elements into the existing ast?

ast& operator+=(ast&& other) {
    std::move(other.value.begin(),    other.value.end(),    back_inserter(value));
    std::move(other.children.begin(), other.children.end(), back_inserter(children));
    return *this;
}

Demo Time

Let's devise the simplest grammar I can think of for this AST:

start = '{' >> -(int_ % ',') >> ';' >> -(start % ',') >> '}';

Note I didn't even make the ; optional. Oh well. Samples. Exercises for readers. ☡ You know the drill.

We implement the trivial function ast parse(It f, It l), and then we can simply merge the asts:

int main() {
    ast merged;
    for(std::string const& input : {
                "{1 ,2 ,3 ;{4 ;{9 , 8 ;}},{5 ,6 ;}}",
                "{10,20,30;{40;{90, 80;}},{50,60;}}",
            })
    {
        merged += parse(input.begin(), input.end());
        std::cout << "merged + " << input << " --> " << merged << "\n";
    }
}

Live On Coliru

//#define BOOST_SPIRIT_DEBUG
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/karma.hpp>

namespace qi = boost::spirit::qi;
namespace karma = boost::spirit::karma;

struct ast;

//typedef boost::make_recursive_variant<boost::recursive_wrapper<ast> >::type node;
typedef boost::variant<boost::recursive_wrapper<ast> > node;

struct ast {
    std::vector<int> value;
    std::vector<node> children;

    ast& operator+=(ast&& other) {
        std::move(other.value.begin(),    other.value.end(),    back_inserter(value));
        std::move(other.children.begin(), other.children.end(), back_inserter(children));
        return *this;
    }
};

BOOST_FUSION_ADAPT_STRUCT(ast,
    (std::vector<int>,value)
    (std::vector<node>,children)
)

template <typename It, typename Skipper = qi::space_type>
struct grammar : qi::grammar<It, ast(), Skipper>
{
    grammar() : grammar::base_type(start) {
        using namespace qi;
        start = '{' >> -(int_ % ',') >> ';' >> -(start % ',') >> '}';
        BOOST_SPIRIT_DEBUG_NODES((start));
    }
  private:
    qi::rule<It, ast(), Skipper> start;
};

// for output:
static inline std::ostream& operator<<(std::ostream& os, ast const& v) {
    using namespace karma;
    rule<boost::spirit::ostream_iterator, ast()> r;
    r = '{' << -(int_ % ',') << ';' << -((r|eps) % ',')  << '}';
    return os << format(r, v);
}

template <typename It> ast parse(It f, It l) 
{
    ast parsed;

    static grammar<It> g;
    bool ok = qi::phrase_parse(f,l,g,qi::space,parsed);

    if (!ok || (f!=l)) {
        std::cout << "Parse failure\n";
        std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
        exit(255);
    }

    return parsed;
}

int main() {
    ast merged;
    for(std::string const& input : {
                "{1 ,2 ,3 ;{4 ;{9 , 8 ;}},{5 ,6 ;}}",
                "{10,20,30;{40;{90, 80;}},{50,60;}}",
            })
    {
        merged += parse(input.begin(), input.end());
        std::cout << "merged + " << input << " --> " << merged << "\n";
    }
}

Of course, it prints:

merged + {1 ,2 ,3 ;{4 ;{9 , 8 ;}},{5 ,6 ;}} --> {1,2,3;{4;{9,8;}},{5,6;}}
merged + {10,20,30;{40;{90, 80;}},{50,60;}} --> {1,2,3,10,20,30;{4;{9,8;}},{5,6;},{40;{90,80;}},{50,60;}}

UPDATE

In this - trivial - example, you can just bind the collections to the attributes in the parse call. The same thing will happen without the operator+= call needed to move the elements, because the rules are written to automatically append to the bound container attribute.

CAVEAT: A distinct disadvantage of modifying the target value in-place is what happens if parsing fails. In the version the merged value will then be "undefined" (has received partial information from the failed parse).

So if you want to parse inputs "atomically", the first, more explicit approach is a better fit.

So the following is a slightly shorter way to write the same:

Live On Coliru

// #define BOOST_SPIRIT_DEBUG
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/karma.hpp>

namespace qi = boost::spirit::qi;
namespace karma = boost::spirit::karma;

struct ast;

//typedef boost::make_recursive_variant<boost::recursive_wrapper<ast> >::type node;
typedef boost::variant<boost::recursive_wrapper<ast> > node;

struct ast {
    std::vector<int> value;
    std::vector<node> children;
};

BOOST_FUSION_ADAPT_STRUCT(ast,
    (std::vector<int>,value)
    (std::vector<node>,children)
)

template <typename It, typename Skipper = qi::space_type>
struct grammar : qi::grammar<It, ast(), Skipper>
{
    grammar() : grammar::base_type(start) {
        using namespace qi;
        start = '{' >> -(int_ % ',') >> ';' >> -(start % ',') >> '}';
        BOOST_SPIRIT_DEBUG_NODES((start));
    }
  private:
    qi::rule<It, ast(), Skipper> start;
};

// for output:
static inline std::ostream& operator<<(std::ostream& os, ast const& v) {
    using namespace karma;
    rule<boost::spirit::ostream_iterator, ast()> r;
    r = '{' << -(int_ % ',') << ';' << -((r|eps) % ',')  << '}';
    return os << format(r, v);
}

template <typename It> void parse(It f, It l, ast& into) 
{
    static grammar<It> g;
    bool ok = qi::phrase_parse(f,l,g,qi::space,into);

    if (!ok || (f!=l)) {
        std::cout << "Parse failure\n";
        std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
        exit(255);
    }
}

int main() {
    ast merged;
    for(std::string const& input : {
            "{1 ,2 ,3 ;{4 ;{9 , 8 ;}},{5 ,6 ;}}",
            "{10,20,30;{40;{90, 80;}},{50,60;}}",
            })
    {
        parse(input.begin(), input.end(), merged);
        std::cout << "merged + " << input << " --> " << merged << "\n";
    }
}

Still prints