1
votes

So I am trying to write a compiler in F# and have been looking at the Fslex and Fsyacc tools that come with the F# powerpack. There is a sample project that takes care of the external build tools that I have been trying to understand. It can be downloaded here. The example compiles and runs for me, but I think there is a subtle error in the grammar. I say subtle, because the grammar looks similar to what I have seen in the Dragon book for parsing expressions and I don't have the experience to spot it.

The input "4*5+3" correctly evaluated to 23.

The input 4*5-3, however, generates a parse error. That is an error in the code generated by Fsyacc.

I would appreciate your help in better understanding what the problem so I can be better informed and have more confidence in Fsyacc. I have posted the *.fsy file below.

// This is the type of the data produced by a successful reduction of the 'start'
// symbol:
%type < Ast.Equation > start

%%

// These are the rules of the grammar along with the F# code of the 
// actions executed as rules are reduced.  In this case the actions 
// produce data using F# data construction terms.
start: Prog { Equation($1) }

Prog:
    | Expr EOF                  { $1 }

Expr: 
    | Expr PLUS  Term           { Plus($1, $3)  }
    | Expr MINUS Term           { Minus($1, $3) }
    | Term                      { Term($1)      }

Term:
    | Term ASTER Factor         { Times($1, $3)  }
    | Term SLASH Factor         { Divide($1, $3) }
    | Factor                    { Factor($1)     }

Factor:
    | FLOAT                     { Float($1)  }
    | INT32                     { Integer($1) }
    | LPAREN Expr RPAREN        { ParenEx($2) }

And here is the definition for AST data type

namespace Ast
open System

type Factor =
    | Float   of Double
    | Integer of Int32
    | ParenEx of Expr

and Term =
    | Times  of Term * Factor
    | Divide of Term * Factor
    | Factor of Factor

and Expr =
    | Plus  of Expr * Term
    | Minus of Expr * Term
    | Term  of Term

and Equation =
    | Equation of Expr

EDIT

I have posted the lexer definition and the code to drive the parser as well to help with understanding the error.

{
module Lexer
open System
open Parser
open Microsoft.FSharp.Text.Lexing

let lexeme lexbuf =
    LexBuffer<char>.LexemeString lexbuf
}

// These are some regular expression definitions
let digit = ['0'-'9']
let whitespace = [' ' '\t' ]
let 

newline = ('\n' | '\r' '\n')

rule tokenize = parse
| whitespace    { tokenize lexbuf }
| newline       { tokenize lexbuf }
// Operators
| "+"           { PLUS }
| "-"           { MINUS }
| "*"           { ASTER }
| "/"           { SLASH }
// Misc
| "("           { LPAREN }
| ")"           { RPAREN }
// Numberic constants
| ['-']?digit+                                  { INT32 (Int32.Parse(lexeme lexbuf)) }
| ['-']?digit+('.'digit+)?(['e''E']digit+)?     { FLOAT (Double.Parse(lexeme lexbuf)) }
// EOF
| eof   { EOF }

Lastly, the code to drive the parser.

    // This project type requires the F# PowerPack at http://fsharppowerpack.codeplex.com/releases
    // Learn more about F# at http://fsharp.net
    // Original project template by Jomo Fisher based on work of Brian McNamara, Don Syme and Matt Valerio
    // This posting is provided "AS IS" with no warranties, and confers no rights.

    open System
    open Microsoft.FSharp.Text.Lexing

    open Ast
    open Lexer
    open Parser

    /// Evaluate a factor
    let rec evalFactor factor =
        match factor with
        | Float x   -> x
        | Integer x -> float x
        | ParenEx x -> evalExpr x

    /// Evaluate a term
    and evalTerm term =
        match term with
        | Times (term1, term2)  -> (evalTerm term1) * (evalTerm term2)
        | Divide (term1, term2)  -> (evalTerm term1) / (evalTerm term2)
        | Factor fact         -> evalFactor fact

    /// Evaluate an expression
    and evalExpr expr =
        match expr with
        | Plus (expr1, expr2)  -> (evalExpr expr1) + (evalExpr expr2)
        | Minus (expr1, expr2)  -> (evalExpr expr1) - (evalExpr expr2)
        | Term term          -> evalTerm term

    /// Evaluate an equation
    and evalEquation eq =
        match eq with
        | Equation expr -> evalExpr expr

    printfn "Calculator"

    let rec readAndProcess() =
        printf ":"
        match Console.ReadLine() with
        | "quit" -> ()
        | expr ->
            try
                printfn "Lexing [%s]" expr
                let lexbuff = LexBuffer<char>.FromString(expr)

                printfn "Parsing..."
                let equation = Parser.start Lexer.tokenize lexbuff

                printfn "Evaluating Equation..."
                let result = evalEquation equation

                printfn "

Result: %s" (result.ToString())

        with ex ->
            printfn "Unhandled Exception: %s" ex.Message

        readAndProcess()

readAndProcess()

EDIT: The optional minus sign in the lexer was the problem. After removing it the sample works as expected.

1
Can you post the lexer definition as well?Juliet
It might work for binary minus, but does it work for the unary?Román
@Román When I compiled and ran it, after fixing the optional minus part of the lexer, both unary and binary minus worked as expected.Samsdram

1 Answers

3
votes

I have only glanced, it looks like the lexer perhaps is treating

// Numberic constants 
| ['-']?digit+                                  { INT32 (Int32.Parse(lexeme lexbuf)) } 
etc

the minus sign here

4*5-3

as unary, part of the constant "-3" rather than as a binary minus. So I agree it is an error in the sample. I would get rid of the optional minus in the lexer, and add a rule in the parser along the lines of Factor going to e.g. "MINUS INT32".

Just a sketch of how to fix it, hopefully this will steer you or you'll get another more in-depth answer with full code.