15
votes

I'm working on a compiler and I would like to improve its performances. I found that about 50% of the time is spent parsing the source files. As the source file are quite small and I do quite a lot of transformations after that, it seems to me that it is perfectible.

My parser is a Boost Spirit parser with a lexer (with lexer::pos_iterator) and I have a middle sized grammar. I'm parsing the source into an AST.

My problem is that I have no idea what takes the most time during the parsing: copies of AST nodes, lexer, parser rules or memory.

I don't think that it is I/O problem since I'm working on a SSD and that I'm reading the file entirely at the beginning and then using only the memory version.

I tried using profilers, but the methods that takes time are some methods from Boost with names hundreds of characters long and I don't know exactly what they do...

So, is there a preferred way to benchmark a Boost Spirit Parser and its grammar ? Or is there some rules that can be used to verify the efficiency at some specific points ?

Thanks

Sources for those interested:

1
Here is a write up by ApochiQ who is using Boost.Spirit as the parser for the Epoch language. He considerably improved the performance of his parser between the 10 and 11 releases and wrote up what he focused on here.Matthieu M.
Benchmarking ANYTHING typically involves running something through the "code under test", and then analysing the results. If you have a complex system, it often helps to produce a "null-driver" or "null-interface", so that you can for example feed in a source file and parse it, without acting on the parsed results.Mats Petersson
@MatthieuM. Yes I know this article. I already followed several advices of this great article a long time ago. But I don't know what advice to follow next.Baptiste Wicht
Do you mind sharing the code under test? I'm interested myselfsehe
Have you run a profiler on your code?Mats Petersson

1 Answers

13
votes

I have given things a quick scan.

My profiler quickly told me that constructing the grammar and (especially) the lexer object took quite some resources.

Indeed, just changing a single line in SpiritParser.cpp saved 40% of execution time1 (~28s down to ~17s):

    lexer::Lexer lexer;

into

    static const lexer::Lexer lexer;

Now,

  • making the grammar static involves making it stateless. I made this happen by

    • moving position_begin into qi::_a (using qi::locals) and
    • passing it in as an inherited attribute at the appropriate times

      • the EDDIGrammar and ValueGrammar grammars, e.g.

        start %= qi::eps [ qi::_a = qi::_r1 ] >> program;
        
      • as well as the individual rules from ValueGrammar that are being used externally).

    This had a number of suboptimal side effects:

    • rule debugging is commented out because the lexer::pos_iterator_type has no default output streaming overload
    • the qi::position(position_begin) expression has been 'faked' with a rather elaborate replacement:

      auto local_pos = qi::lazy (
              boost::phoenix::construct<qi::position>(qi::_a)
          );
      

      That doesn't seem ideal. (Ideally, one would like to replace qi::position by a modified custom parser directive that knows how to get get begin_position from qi::locals (?) so there would be no need to invoke a parser expression lazily?)

Anyways, implementing these further changes2 shaved off another ~15% of execution time:

static const lexer::Lexer lexer;
static const parser::EddiGrammar grammar(lexer); 

try {
    bool r = spirit::lex::tokenize_and_parse(
            position_begin, position_end,
            lexer, 
            grammar(boost::phoenix::cref(position_begin)),
            program);

Loose ideas:

  • Have you considered generating a static lexer (Generating the Static Analyzer)
  • Have you considered using expectation points to potentially reduce the amount of backtracking (note: I didn't measure anything in that area)
  • Have you considered alternatives for Position::file and Position::theLine? Copying the strings seems heavier than necessary. I'd prefer to store const char *. You could also look at Boost Flyweight
  • Is the pre-skip really required inside your qi::position directive?
  • (Somewhat non-serious: have you considered porting to Spirit X3? It seems to promise potential benefits in the form of move semantics.)

Hope this helps.


[1] When parsing all test cases in test/cases/*.eddi 100 times like so (github):

for (int i=0; i<100; i++)
    for (auto& fname : argv)
{
    eddic::ast::SourceFile program;
    std::cout << fname << ": " << std::boolalpha << parser.parse(fname, program, nullptr) << "\n";
}

Timed with a simple

time ./test ../../test/cases/*.eddi | md5sum

With the md5sum acting as a sanity check.

[2] I created a pull request with the proof-of-concept refactorings here https://github.com/wichtounet/eddic/pull/52