Parsing with user-defined operator precedence

Question

OK, so here's a question: Given that Haskell allows you to define new operators with arbitrary operator precedence... how is it possible to actually parse Haskell source code?

You cannot know what operator precedences are set until you parse the source. But you cannot parse the source until you know the correct operator precedences. So... um, how?

Consider, for example, the expression

x *** y +++ z

Until we finish parsing the module, we don't know what other modules are imported, and hence what operators (and other identifiers) might be in scope. We certainly don't know their precedences yet. But the parser has to return something. Should it return

(x *** y) +++ z

Or should it return

x *** (y +++ z)

The poor parser has no way to know. This can only be determined once you hunt down the import that brings (+++) and (***) into scope, load that file off disk, and discover what the operator precedences are. Clearly the parser itself isn't going to do all that I/O; a parser just turns a stream of characters into an AST.

Clearly somebody somewhere has figured out how to do this. But I can't work it out... Any hints?

You could possibly build an AST with more than two children. Say this specific node gets as children the list [x, ***, y, +++, z], then check the precedence and build a binary node to replace itself afterwards. (There is probably a better approach). — Mephy
Note that you could also do this very easily without any hacks by just having two parse passes, one to grab operator fixity and precedence and one to actually parse the source code. — Cubic
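
The flat-list idea from the first comment can be sketched roughly as follows. This is only an illustration under assumptions: the names Tok, Expr, FixityEnv and resolve are made up here, the fixity table is assumed to have been collected already (for example by a separate pass, as the second comment suggests), and non-associative operators (plain infix) are not modelled.

```haskell
-- Hypothetical types for illustration; not taken from any real compiler.
data Assoc  = AssocL | AssocR deriving (Eq, Show)
data Fixity = Fixity Assoc Int deriving (Eq, Show)
type FixityEnv = [(String, Fixity)]

-- What the parser emits: operands and operators in source order.
data Tok  = TVar String | TOp String deriving (Eq, Show)
-- What we want once the fixities are known.
data Expr = Var String | BinOp String Expr Expr deriving (Eq, Show)

-- Haskell's default when no fixity declaration applies is infixl 9.
fixityOf :: FixityEnv -> String -> Fixity
fixityOf env op = case lookup op env of
  Just f  -> f
  Nothing -> Fixity AssocL 9

-- Turn the flat list into a tree by precedence climbing.
-- Returns Nothing if the list is not operand, operator, operand, ...
resolve :: FixityEnv -> [Tok] -> Maybe Expr
resolve env toks = case go 0 toks of
  Just (e, []) -> Just e
  _            -> Nothing
  where
    -- Parse one operand, then as many operators of precedence >= minPrec
    -- as follow it.
    go minPrec (TVar v : rest) = loop minPrec (Var v) rest
    go _       _               = Nothing

    loop minPrec lhs (TOp op : rest)
      | prec >= minPrec = do
          (rhs, rest') <- go next rest
          loop minPrec (BinOp op lhs rhs) rest'
      where
        Fixity assoc prec = fixityOf env op
        -- A right-associative operator lets the same precedence recurse
        -- into its right operand; a left-associative one does not.
        next = if assoc == AssocR then prec else prec + 1
    loop _ lhs rest = Just (lhs, rest)
```

With [("***", Fixity AssocL 7), ("+++", Fixity AssocL 6)] as the table, resolve turns the list for x *** y +++ z into the tree for (x *** y) +++ z; swap the two precedences and it produces x *** (y +++ z) instead. The parser itself never needs to know which one it will be.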

András Kovács · Accepted Answer · 2015-03-21T17:22:36

Quoting the GHC Trac page on the parser:

Infix operators are parsed as if they were all left-associative. The renamer uses the fixity declarations to re-associate the syntax tree.
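
To make the quoted description concrete, here is a minimal sketch of such a re-association step. It is not GHC's actual code: the types and the names reassoc and mkOp are hypothetical, the fixity table is assumed to be available by the time this runs, and non-associative operators are left out.

```haskell
-- Hypothetical types for illustration; GHC's real AST is far richer.
data Assoc  = AssocL | AssocR deriving (Eq, Show)
data Fixity = Fixity Assoc Int deriving (Eq, Show)
type FixityEnv = [(String, Fixity)]

data Expr
  = Var String
  | Par Expr                 -- explicit parentheses written in the source
  | BinOp String Expr Expr
  deriving (Eq, Show)

-- Haskell's default when no fixity declaration applies is infixl 9.
fixityOf :: FixityEnv -> String -> Fixity
fixityOf env op = case lookup op env of
  Just f  -> f
  Nothing -> Fixity AssocL 9

-- Walk the tree bottom-up and rebuild every operator application.
reassoc :: FixityEnv -> Expr -> Expr
reassoc env (BinOp op l r) = mkOp env op (reassoc env l) (reassoc env r)
reassoc env (Par e)        = Par (reassoc env e)
reassoc _   e              = e

-- Build (l `op2` c), assuming l is already correctly associated.
-- If l is (a `op1` b) and op2 binds tighter than op1, or both have the
-- same precedence and are right-associative, c belongs inside the right
-- operand: a `op1` (b `op2` c). Parenthesised operands are left alone.
mkOp :: FixityEnv -> String -> Expr -> Expr -> Expr
mkOp env op2 (BinOp op1 a b) c
  | p2 > p1 || (p1 == p2 && a1 == AssocR && a2 == AssocR)
  = BinOp op1 a (mkOp env op2 b c)
  where
    Fixity a1 p1 = fixityOf env op1
    Fixity a2 p2 = fixityOf env op2
mkOp _ op l r = BinOp op l r
```

So the parser always hands over the left-nested tree for (x *** y) +++ z. If (+++) turns out to bind tighter than (***), reassoc rotates it into x *** (y +++ z); otherwise the tree stays as it is.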
