So I am trying to write a compiler in F# and have been looking at the Fslex and Fsyacc tools that come with the F# powerpack. There is a sample project that takes care of the external build tools that I have been trying to understand. It can be downloaded here. The example compiles and runs for me, but I think there is a subtle error in the grammar. I say subtle, because the grammar looks similar to what I have seen in the Dragon book for parsing expressions and I don't have the experience to spot it.
The input "4*5+3" correctly evaluated to 23.
The input 4*5-3, however, generates a parse error. That is an error in the code generated by Fsyacc.
I would appreciate your help in better understanding what the problem so I can be better informed and have more confidence in Fsyacc. I have posted the *.fsy file below.
// This is the type of the data produced by a successful reduction of the 'start'
// symbol:
%type < Ast.Equation > start
%%
// These are the rules of the grammar along with the F# code of the
// actions executed as rules are reduced. In this case the actions
// produce data using F# data construction terms.
start: Prog { Equation($1) }
Prog:
| Expr EOF { $1 }
Expr:
| Expr PLUS Term { Plus($1, $3) }
| Expr MINUS Term { Minus($1, $3) }
| Term { Term($1) }
Term:
| Term ASTER Factor { Times($1, $3) }
| Term SLASH Factor { Divide($1, $3) }
| Factor { Factor($1) }
Factor:
| FLOAT { Float($1) }
| INT32 { Integer($1) }
| LPAREN Expr RPAREN { ParenEx($2) }
And here is the definition for AST data type
namespace Ast
open System
type Factor =
| Float of Double
| Integer of Int32
| ParenEx of Expr
and Term =
| Times of Term * Factor
| Divide of Term * Factor
| Factor of Factor
and Expr =
| Plus of Expr * Term
| Minus of Expr * Term
| Term of Term
and Equation =
| Equation of Expr
EDIT
I have posted the lexer definition and the code to drive the parser as well to help with understanding the error.
{
module Lexer
open System
open Parser
open Microsoft.FSharp.Text.Lexing
let lexeme lexbuf =
LexBuffer<char>.LexemeString lexbuf
}
// These are some regular expression definitions
let digit = ['0'-'9']
let whitespace = [' ' '\t' ]
let
newline = ('\n' | '\r' '\n')
rule tokenize = parse
| whitespace { tokenize lexbuf }
| newline { tokenize lexbuf }
// Operators
| "+" { PLUS }
| "-" { MINUS }
| "*" { ASTER }
| "/" { SLASH }
// Misc
| "(" { LPAREN }
| ")" { RPAREN }
// Numberic constants
| ['-']?digit+ { INT32 (Int32.Parse(lexeme lexbuf)) }
| ['-']?digit+('.'digit+)?(['e''E']digit+)? { FLOAT (Double.Parse(lexeme lexbuf)) }
// EOF
| eof { EOF }
Lastly, the code to drive the parser.
// This project type requires the F# PowerPack at http://fsharppowerpack.codeplex.com/releases
// Learn more about F# at http://fsharp.net
// Original project template by Jomo Fisher based on work of Brian McNamara, Don Syme and Matt Valerio
// This posting is provided "AS IS" with no warranties, and confers no rights.
open System
open Microsoft.FSharp.Text.Lexing
open Ast
open Lexer
open Parser
/// Evaluate a factor
let rec evalFactor factor =
match factor with
| Float x -> x
| Integer x -> float x
| ParenEx x -> evalExpr x
/// Evaluate a term
and evalTerm term =
match term with
| Times (term1, term2) -> (evalTerm term1) * (evalTerm term2)
| Divide (term1, term2) -> (evalTerm term1) / (evalTerm term2)
| Factor fact -> evalFactor fact
/// Evaluate an expression
and evalExpr expr =
match expr with
| Plus (expr1, expr2) -> (evalExpr expr1) + (evalExpr expr2)
| Minus (expr1, expr2) -> (evalExpr expr1) - (evalExpr expr2)
| Term term -> evalTerm term
/// Evaluate an equation
and evalEquation eq =
match eq with
| Equation expr -> evalExpr expr
printfn "Calculator"
let rec readAndProcess() =
printf ":"
match Console.ReadLine() with
| "quit" -> ()
| expr ->
try
printfn "Lexing [%s]" expr
let lexbuff = LexBuffer<char>.FromString(expr)
printfn "Parsing..."
let equation = Parser.start Lexer.tokenize lexbuff
printfn "Evaluating Equation..."
let result = evalEquation equation
printfn "
Result: %s" (result.ToString())
with ex ->
printfn "Unhandled Exception: %s" ex.Message
readAndProcess()
readAndProcess()
EDIT: The optional minus sign in the lexer was the problem. After removing it the sample works as expected.