I'm a little stuck on the way "real parsers", such as F# or Haskell, do to parse custom operators. For a "normal" language, we would simply define an AST node at which there would be predefined operator possibilities, for example: +
, -
, *
, ==
, >=
, +=
, ... etc.
But I wonder how to do it in a functional language that allow you to create custom operators, let's take OCaml as an example, quite close to F# (language of my implementation), and is quite well known.
Thus, each operator is a function and has a type, as well as a definition, and we can create our own operators:
val (+) : 'a -> 'a -> 'a
let (+) x y = x + y
val (|>) : 'a -> ('a -> 'b) -> 'b
let (|>) x f = f x
So I wonder how it works with parsing to make it work.
1) How does the parser know that we want to use a custom operator? If we use a function that takes another function in the first argument and another element in the second, how does it know that we call a function rather than using an infix operator?
let example x =
// Do we add up, or do we call the function "takeOpAndOther"?
takeOpAndOther + x
2) To answer this question, I thought of a way to do this in F#, thanks to FParsec. The first solution that came to mind was to simply use OperatorPrecedenceParser
. The concern is that this means only works for predefined operators (or if there is a way to do what I want with it, I don't know how).
Then I thought about creating a simple parser:
open FParsec
type Expression =
| Number of int
| InfixF of Expression * string * Expression
| DataName of string
| FunctionCall of string * Expression list
let ws = skipMany (pchar ' ' <|> pchar '\t') <?> ""
let ws1 = skipMany1 (pchar ' ' <|> pchar '\t') <?> ""
let identifier = many1Satisfy (fun c -> isLetter c || isDigit c)
let allowedSymbols =
[ '!'; '@'; '#'; '$'; '%'; '^'; '&';
'§'; '*'; '°'; '.'; '~'; ':'; '-';
'+'; '='; '?'; '/'; '>'; '<'; '|'; ]
let customOperatorIdentifier = many1SatisfyL (fun c -> allowedSymbols |> List.contains c) "valid custom operator"
// I call about this parser
let rec infixF () = parse {
let! lvalue = ws >>? expression
let! op = ws >>? customOperatorIdentifier
let! rvalue = ws >>? expression
return InfixF(lvalue, op, rvalue)
}
and number = pint32 |>> Number
and dataName = identifier |>> DataName
and functionCall () = parse {
let! id = ws >>? identifier
let! parameters = sepEndBy1 (ws >>? expression) ws1
return FunctionCall(id, parameters)
}
and expression =
attempt number <|>
attempt dataName <|>
attempt (functionCall ()) <|>
infixF ()
let test code =
match run (ws >>? expression .>>? ws .>>? eof) code with
| Success (result, _, _) -> printfn "%A" result
| Failure (msg, _, _) -> printfn "%s" msg
test "87 + 12"
Except, as you might expect, it doesn't work as expected. Indeed, as the code is presented (because when I try infixF
alone, and remove it from expression
, then it works, but obviously only for one expression: x + y
, but not x + y + z
), it will cause an overflow error every time. This is the main problem, I think, I encounter in my implementation.
However, the two solutions described do not satisfy one of my questions, which was the sending of function-operator.
In short... I have some questions I would like to receive explanations about, and an implementation problem I would like to fix.
Thank you! :)
OperatorPrecedenceParser
only works with predefined operators appears unfounded. There's anAddOperator
method that can add new operators dynamically; so when you parse something like F#'slet (>=>) f g = ...
, you can dynamically add the>=>
operator to your parser. I haven't done it myself yet so I can't advise you on specifics, but a bit of experimenting ought to help. – rmunnlet rec (~+) a b = a ~* (b - 1) and (~*) a b = if b<0 then a else (a+b) ~+ b in ...
because~*
is defined lexically after it is used. – Dan Robertson