3
votes

I'm wondering if there's a way in boost::spirit::lex to write a token's value back to the input stream (possibly after editing it) and then rescan it. What I'm basically looking for is functionality like that offered by unput() in Flex.

Thanks!

2
What are you trying to achieve? I mean, in what context would you need to use unput()? If you show an example, I might be able to show you how I'd do it (possibly using Lexer states) - sehe
Basically, I need the lexer to match an identifier immediately followed by an open paren, like "abc(", as one token, and put it back into the input stream with the paren moved to the front, like "(abc ". The lexer would then scan it again, this time as two separate tokens (a paren token followed by an identifier token). - Haitham Gad
Ok, I posted my take at this, let me know if I understood the goal incorrectly. - sehe

2 Answers

3
votes

Sounds like you just want to accept tokens in different orders but with the same meaning.

Without further ado, here is a complete sample that shows how this would be done, exposing the identifier regardless of input order. Output:

Input 'abc(' Parsed as: '(abc'
Input '(abc' Parsed as: '(abc'

Code

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/lex_lexertl.hpp>
#include <iostream>
#include <string>

using namespace boost::spirit;

///// LEXER
template <typename Lexer>
struct tokens : lex::lexer<Lexer>
{
    tokens()
    {
        identifier = "[a-zA-Z][a-zA-Z0-9]*";
        paren_open = '(';

        this->self.add
            (identifier)
            (paren_open)
            ;
    }

    lex::token_def<std::string> identifier;
    lex::token_def<lex::omit> paren_open; // the parenthesis carries no attribute value
};

///// GRAMMAR
template <typename Iterator>
struct grammar : qi::grammar<Iterator, std::string()>
{
    template <typename TokenDef>
        grammar(TokenDef const& tok) : grammar::base_type(ident_w_parenopen)
    {
        // accept the identifier and the paren in either order; the rule's
        // attribute is just the identifier's text
        ident_w_parenopen = 
              (tok.identifier >> tok.paren_open)
            | (tok.paren_open >> tok.identifier) 
            ;
    }
  private:
    qi::rule<Iterator, std::string()> ident_w_parenopen;
};

///// DEMONSTRATION
typedef std::string::const_iterator It;

template <typename T, typename G>
void DoTest(std::string const& input, T const& tokens, G const& g)
{
    It first(input.begin()), last(input.end());

    std::string parsed;
    bool r = lex::tokenize_and_parse(first, last, tokens, g, parsed);

    if (r) {
        std::cout << "Input '" << input << "' Parsed as: '(" << parsed << "'\n";
    }
    else {
        std::string rest(first, last);
        std::cerr << "Parsing '" << input << "' failed\n" << "stopped at: \"" << rest << "\"\n";
    }
}

int main(int argc, char* argv[])
{
    typedef lex::lexertl::token<It, boost::mpl::vector<std::string> > token_type;
    typedef lex::lexertl::lexer<token_type> lexer_type;
    typedef tokens<lexer_type>::iterator_type iterator_type;

    tokens<lexer_type> tokens;
    grammar<iterator_type> g (tokens);

    DoTest("abc(", tokens, g);
    DoTest("(abc", tokens, g);
}
0
votes

I ended up implementing my own unput() functionality as follows:

   // Phoenix function object that rewrites the input buffer in place: it moves
   // 'start' back far enough to hold 'str', copies 'str' into the buffer, and
   // resets 'end' so the lexer resumes scanning from 'start'.
   struct unputImpl
   {
      template <typename Iter1T, typename Iter2T, typename StrT>
      struct result {
         typedef void type;
      };

      template <typename Iter1T, typename Iter2T, typename StrT>
      typename result<Iter1T, Iter2T, StrT>::type operator()(Iter1T& start, Iter2T& end, StrT str) const {
         // Move 'start' back by however much 'str' is longer than the matched token.
         start -= (str.length() - std::distance(start, end));
         std::copy(str.begin(), str.end(), start);
         end = start;
      }
   };

   phoenix::function<unputImpl> const unput = unputImpl();

This can then be used like:

   // Match "ident(" as a single throw-away token, unput "(ident " in its place,
   // and discard the original match so the lexer rescans the rewritten text.
   this->self += lex::token_def<lex::omit>("{SYMBOL}\\(")
        [
           unput(_start, _end, "(" + construct<string>(_start, _end - 1) + " "),
           _pass = lex::pass_flags::pass_ignore
        ];

If the unputted string is longer than the matched token, it will overwrite some of the previously parsed input. The thing you need to take care of is to make sure the input string has enough empty space at the very beginning to handle the case where unput() is called for the very first matched token.
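
For example, here is a minimal sketch of that padding. The helper name and the slack size are made up for illustration, and it assumes your lexer skips or ignores leading whitespace:

   #include <string>

   // Hypothetical helper: prepend some slack so unput() can move 'start'
   // backwards even when it fires on the very first matched token. The slack
   // needed depends on how much longer the unputted text can get than the
   // token it replaces.
   std::string pad_for_unput(std::string const& raw, std::size_t slack = 8)
   {
       return std::string(slack, ' ') + raw;
   }

   // Usage (sketch): tokenize a padded, mutable copy instead of the raw input,
   // since unput() writes into the buffer through the iterators.
   // std::string input = pad_for_unput(raw_input);
   // std::string::iterator first = input.begin(), last = input.end();
   // lex::tokenize(first, last, my_lexer);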