Minimizing boost::spirit compile times

Question

Any ideas for reducing boost::spirit compile time?

I have just ported a flex parser to boost::spirit. The EBNF has about 25 rules.

The result runs well and the runtime performance is fine.

The problem is that compile takes for ever! It takes about ten minutes and requires almost a gigabyte of memory. The original flex parser compiled in a few seconds.

I am using boost version 1.44.0 and Visual Studio 2008.

In Joel de Guzman's article 'Best Practices' it says

Rules with complex definitions hurt the compiler badly. We’ve seen rules that are more than a hundred lines long and take a couple of minutes to compile

Well I do not have anything near that long, but my compile still takes way more than a couple of minutes

Here is the most complex part of my grammar. Is it a candidate for being broken up into smaller parts, in some way?

    rule
        =   (   tok.if_ >> condition  >> tok.then_ >> *sequel  )                            [ bind( &cRuleKit::AddRule, &myRulekit ) ]
        |   (   tok.if_ >> condition  >> tok.and_ >> condition >> tok.then_ >> *sequel  )   [ bind( &cRuleKit::AddRule, &myRulekit ) ]
        ;

    condition
        =   ( tok.identifier >> tok.oper_ >> tok.value )                                    [ bind( &cRuleKit::AddCondition, &myRulekit, _pass, _1, _2, _3 ) ]
        |   ( tok.identifier >> tok.between_ >> tok.value >> "," >> tok.value )             [ bind( &cRuleKit::AddConditionBetween, &myRulekit, _pass, _1, _3, _4 ) ]
        ;

    sequel
        =   ( tok.priority_ >> tok.high_ )      [ bind( &cRuleKit::setPriority, &myRulekit, 3 ) ]
        |   ( tok.priority_  )                  [ bind( &cRuleKit::setPriority, &myRulekit, 2 ) ]
        |   ( tok.interval_ >> tok.value )      [ bind( &cRuleKit::setInterval, &myRulekit, _2 ) ]
        |   ( tok.mp3_ >> tok.identifier )      [ bind( &cRuleKit::setMP3, &myRulekit, _2 ) ]
        |   ( tok.disable_ )                    [ bind( &cRuleKit::setNextRuleEnable, &myRulekit, false ) ]
        ;

By commenting out parts of the grammar, I discovered which part the compiler was spending most time with.

        set_reading
            =   tok.set_reading >> +attribute_reading
            ;

        attribute_reading
            =   ( tok.name_ >> tok.identifier )
                [
                                                bind( &cPackage::Add, &myReadings, _pass, _2 )
                ]
            |   ( tok.nmea_ >> tok.identifier )
                [
                                                bind( &cPackage::setNextNMEA, &myReadings, _2 )
                ]
            |   ( tok.column_ >> tok.integer )
                [
                                                bind( &cPackage::setNextColumn, &myReadings, _2 )
                ]
            |   ( tok.precision_ >> tok.value )
                [
                                                bind( &cPackage::setNextPrecision, &myReadings, _2 )
                ]
            |   ( tok.unit_ >> tok.identifier )
                [
                                                bind( &cPackage::setNextUnit, &myReadings, _2 )
                ]
            |   ( tok.value_ >> tok.identifier )
                [
                                                bind( &cPackage::setNextValue, &myReadings, _2 )
                ]
            |   ( tok.qualifier_ >> tok.identifier >> tok.qual_col_ >> tok.integer )
                [
                                                bind( &cPackage::setNextQualifier, &myReadings, _2, _4 )
                ]
            ;

I wouldn't call it complex, but it is certainly the longest rule. So I thought I would try splitting it up, like this:

    set_reading
        =   tok.set_reading >> +attribute_reading
        ;

    attribute_reading
        =   attribute_reading_name
        |   attribute_reading_nmea
        |   attribute_reading_col
        |   attribute_reading_precision
        |   attribute_reading_unit
        |   attribute_reading_value
        |   attribute_reading_qualifier
        ;



    attribute_reading_name
        =   ( tok.name_ >> tok.identifier )     [ bind( &cPackage::Add, &myReadings, _pass, _2 ) ]
        ;
    attribute_reading_nmea
        =   ( tok.nmea_ >> tok.identifier )     [ bind( &cPackage::setNextNMEA, &myReadings, _2 ) ]
        ;
    attribute_reading_col
        =   ( tok.column_ >> tok.integer )      [ bind( &cPackage::setNextColumn, &myReadings, _2 ) ]
        ;
    attribute_reading_precision
        =   ( tok.precision_ >> tok.value )     [ bind( &cPackage::setNextPrecision, &myReadings, _2 ) ]
        ;
    attribute_reading_unit
        =   ( tok.unit_ >> tok.identifier )     [ bind( &cPackage::setNextUnit, &myReadings, _2 ) ]
        ;
    attribute_reading_value
        =   ( tok.value_ >> tok.identifier )    [ bind( &cPackage::setNextValue, &myReadings, _2 ) ]
        ;
    attribute_reading_qualifier
        =   ( tok.qualifier_ >> tok.identifier >> tok.qual_col_ >> tok.integer ) [ bind( &cPackage::setNextQualifier, &myReadings, _2, _4 ) ]

        ;

This saves several minutes from the total compile time!!!

Strangely, the peak memory requirement remains the same, it is just required for less time

So, I am feeling a bit more hopeful that all my efforts in learning boost::spirit are going to be worthwhile.

I do think it is a bit strange that the compiler needs to be so carefully guided in this way. I would have thought a modern compiler would have noticed that this rule was just a list of independent OR rules.

I have spent the best part of seven days learning boost::spirit and porting a small, but real world, parser from flex. My conclusion is that it works and the code is very elegant. Unfortunately, naive use by simply expanding the tutorial example code for a real application quickly overburdens the compiler - memory and time taken for a compile becomes completely impractical. Apparently there are techniques for managing this problem but they require arcane knowledge which I do not have the time to learn. I guess I will stick to flex which may be ugly and old-fashioned but is relatively simple and lightning fast.

hkaiser hkaiser · Accepted Answer · 2011-03-31T22:52:17

One trick I could suggest is to separate the compilation of the constructors of both, your lexer and your grammar. The easiest way to achieve this is to leave only the declaration of those constructors in their respecitive header files and to move the definition of those functions into separate translation units. For instance:

grammar.hpp:

template <typename Iterator>
struct grammar : qi::grammar<Iterator>
{
    grammar();   // declaration only
    // ...
};

grammar_def.hpp:

// This file should not contain anything else.
#include "grammar.hpp"

// Definition of constructor.
template <typename Iterator>
grammar<Iterator>::grammar()
{
    // initialize your rules here
}

grammar.cpp:

// This file should not contain anything else.
#include "grammar_def.hpp"

// Explicitly instantiate the constructor for the iterator type
// you use to invoke the grammar (here, as an example I use 
// std::string::const_iterator).
typedef std::string::const_iterator iterator_type;
template grammar<iterator_type>::grammar();

Do the same thing for the lexer object.

This approach requires a bit more work than the straight method, but it allows to distribute the memory and time requirements for the overall compilation. Another advantage of this approach is that any change in the grammar constructor does not require the recompilation of anything except the file grammar.cpp.

Another advice for the lexer: try to minimize the use of token_def<> instances as much as possible. You need to use token_def<> only when you want to access the token value as an attribute during parsing. In all other cases you might get away with lex::string or lex::char_ to define your tokens.

Minimizing boost::spirit compile times

2 Answers