As I didn't know what to look for in the DecisionInfo objects, here's what I found; it helped me improve the parse time by at least an order of magnitude.
First, I enabled profiling on the grammar with org.antlr.v4.runtime.Parser.setProfile(boolean profile), switched the interpreter to SLL prediction with org.antlr.v4.runtime.Parser.getInterpreter().setPredictionMode(PredictionMode.SLL), and executed the parser on thousands of files.
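The setup looks roughly like this (a minimal sketch: the ProparseLexer name, the file-reading plumbing, and the program() entry rule are assumptions on my part; only Proparse itself appears in the original snippets):

    import java.io.IOException;
    import java.util.Arrays;  // used by the stream queries below
    import org.antlr.v4.runtime.CharStreams;
    import org.antlr.v4.runtime.CommonTokenStream;
    import org.antlr.v4.runtime.atn.PredictionMode;

    Proparse parseWithProfiling(String fileName) throws IOException {
        // ProparseLexer and program() are hypothetical names; substitute your own
        ProparseLexer lexer = new ProparseLexer(CharStreams.fromFileName(fileName));
        Proparse parser = new Proparse(new CommonTokenStream(lexer));
        parser.setProfile(true);  // getParseInfo() now returns per-decision stats
        parser.getInterpreter().setPredictionMode(PredictionMode.SLL);
        parser.program();  // run the parse so the statistics are collected
        return parser;
    }

Then I browsed through the rules with the highest prediction time: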
    // timeInPrediction is measured in nanoseconds: keep decisions above 100 ms
    Arrays.stream(parser.getParseInfo().getDecisionInfo())
        .filter(decision -> decision.timeInPrediction > 100_000_000)
        .sorted((d1, d2) -> Long.compare(d2.timeInPrediction, d1.timeInPrediction))
        .forEach(decision -> System.out.println(String.format(
            "Time: %d in %d calls - SLL_Lookaheads: %d Max k: %d Ambiguities: %d Errors: %d Rule: %s",
            decision.timeInPrediction / 1_000_000,  // nanoseconds to milliseconds
            decision.invocations, decision.SLL_TotalLook,
            decision.SLL_MaxLook, decision.ambiguities.size(),
            decision.errors.size(),
            Proparse.ruleNames[Proparse._ATN.getDecisionState(decision.decision).ruleIndex])));
and then sorted by the highest max lookahead, using the same pipeline with only the filter and comparator changed:

    .filter(decision -> decision.SLL_MaxLook > 50)
    .sorted((d1, d2) -> Long.compare(d2.SLL_MaxLook, d1.SLL_MaxLook))
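Assembled into a full statement (same assumptions as above; the shortened output format here is my own), that second query looks like:

    Arrays.stream(parser.getParseInfo().getDecisionInfo())
        .filter(decision -> decision.SLL_MaxLook > 50)
        .sorted((d1, d2) -> Long.compare(d2.SLL_MaxLook, d1.SLL_MaxLook))
        .forEach(decision -> System.out.println(String.format(
            "Max k: %d in %d calls - Rule: %s",
            decision.SLL_MaxLook, decision.invocations,
            Proparse.ruleNames[Proparse._ATN.getDecisionState(decision.decision).ruleIndex])));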
This gave me four rules where most of the time was spent, and in this case knowing where to look was enough to see what had to be changed.