How does one use boost::spirit with an input that consists of something other than characters?

In my case, I have a std::vector<AbstractBaseClass> that I would like to treat as the token stream for my grammar, where each AbstractBaseClass is a token. Something like:

struct AbstractBaseClass
{
};

struct ConcreteClassA : public AbstractBaseClass
{
};

struct ConcreteClassB : public AbstractBaseClass
{
};


std::vector<AbstractBaseClass> stream;
std::vector<AbstractBaseClass>::iterator iter = stream.begin();
std::vector<AbstractBaseClass>::iterator end = stream.end();
bool r = boost::spirit::qi::parse( iter, end, TOKEN_ID_FOR_CONCRETE_CLASS_A >> TOKEN_ID_FOR_CONCRETE_CLASS_B >> TOKEN_ID_FOR_CONCRETE_CLASS_A );

What methods do I need to add to my classes, and what would the token IDs look like, to support this?

Presumably I need to provide something analogous to boost::spirit::lex::token_def<> and boost::spirit::lex::token<>.

I have looked into using these directly, but these two classes seem to assume that there is a raw character stream under the lexer token, which is not true in my case; I get the tokens directly.

Edit:

Well, I answered my own question. I'll leave this up in case anybody else finds it useful. The basics are explained here. There are a handful of caveats.

  • My first attempt was to use boost::variant to describe my tokens. The parser requires that the tokens be convertible to bool. To solve this, I wrapped my boost::variant in boost::optional. Edit: Actually, it seems it's the debugging capability that imposes this requirement. My current solution replaces the stock debug handler with a custom one that no longer checks whether the value of the iterator is "true".
  • Similarly, operator<< must be defined for the token type, at least if you want debug output.
  • In the parse() method, you need to check that your iterator is not at the end before you dereference it.
  • If you have lots of token types, you may need to increase the size of MPL vector and list as described here.
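
For anybody who wants to see the end-of-input check from the third bullet in isolation, here is a minimal sketch that uses only the standard library. It is not Spirit's API: match_token and parse_aba are hypothetical helpers that only mirror the shape of a Qi primitive's parse() (take the iterator by reference, advance it on success only). It also assumes the tokens are stored as std::shared_ptr<AbstractBaseClass> and that the base class has a virtual destructor, since a std::vector<AbstractBaseClass> by value would slice the concrete types.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Assumption: the base is polymorphic so a token's concrete type can be
// recovered with dynamic_cast.
struct AbstractBaseClass { virtual ~AbstractBaseClass() {} };
struct ConcreteClassA : AbstractBaseClass {};
struct ConcreteClassB : AbstractBaseClass {};

// Assumption: tokens are held by pointer to avoid slicing.
typedef std::vector<std::shared_ptr<AbstractBaseClass> > TokenStream;
typedef TokenStream::const_iterator TokenIter;

// Hypothetical helper: consumes one token whose dynamic type is T.
// On failure `first` is left untouched.
template <typename T>
bool match_token(TokenIter& first, TokenIter const& last)
{
    if (first == last)  // check for end of input before dereferencing
        return false;
    if (dynamic_cast<T const*>(first->get()) == nullptr)
        return false;   // token present, but of the wrong type
    ++first;            // consume the token
    return true;
}

// Hypothetical stand-in for the sequence A >> B >> A from the question.
// Works on a copy of the iterator so a partial match consumes nothing.
bool parse_aba(TokenIter& first, TokenIter const& last)
{
    TokenIter it = first;
    bool ok = match_token<ConcreteClassA>(it, last)
           && match_token<ConcreteClassB>(it, last)
           && match_token<ConcreteClassA>(it, last);
    if (ok)
        first = it;
    return ok;
}
```

Calling parse_aba(iter, stream.cend()) on a stream holding A, B, A returns true and leaves iter at the end; on a stream holding only A, the second match_token hits the end check and the call returns false without moving iter.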
If you don't get an answer here, try asking on the Boost Spirit mailing list. It's very active. lists.sourceforge.net/lists/listinfo/spirit-general - Smittii
Or maybe, if no one is giving you an answer here, you should start questioning whether using boost::spirit is really a good idea. - 6502

1 Answer

Your self-answer seems to address a similar, but different question:

  • How can I create a parser class that consumes non-char elements

However, your original question was more along the lines of 'How can I use Spirit parsers with a non-char token stream?'

In that case, the most helpful link would be to Spirit Lex, which is Lexertl integrated into the Boost Spirit framework.

You can easily make Spirit Lex expose token information (beyond the token ID) if necessary, though by default the source iterator range is always available. That way you can mix and match Spirit Lex and Spirit Qi in quite flexible ways.

I don't have time to work out a simple example but,