1
votes

I have a Praser

package app
import scala.util.parsing.combinator._


class MyParser extends JavaTokenParsers {
  import MyParser._
  
  def expr =
    plus | sub | multi | divide | num
  
  def num = floatingPointNumber ^^ (x => Value(x.toDouble).e)

  def plus = num ~ rep("+" ~> num) ^^ {
    case num ~ nums => nums.foldLeft(num.e) {
      (x, y) => Operation("+", x, y)
    }
  }

  def sub = num ~ rep("-" ~> num) ^^ {
    case num ~ nums => nums.foldLeft(num.e){
      (x, y) => Operation("-", x, y)
    }
  }

  def multi = num ~ rep("*" ~> num) ^^ {
    case num ~ nums => nums.foldLeft(num.e){
      (x, y) => Operation("*", x, y)
    }
  }

  def divide = num ~ rep("/" ~> num) ^^ {
    case num ~ nums => nums.foldLeft(num.e){
      (x, y) => Operation("/", x, y)
    }
  }
}

object MyParser {
  sealed trait Expr {
    def e = this.asInstanceOf[Expr]
    def compute: Double = this match {
      case Value(x) => x
      case Operation(op, left, right) => (op : @unchecked) match {
        case "+" => left.compute + right.compute
        case "-" => left.compute - right.compute
        case "*" => left.compute * right.compute
        case "/" => left.compute / right.compute
      }
    }
  }

  case class Value(x: Double) extends Expr
  case class Operation(op: String, left: Expr, right: Expr) extends Expr
}

and I use it to parse the expression something

package app

object Runner extends App {
  val p = new MyParser
  println(p.parseAll(p.expr, "1 * 11"))
}

it prints

[1.3] failure: end of input expected

1 * 11
  ^

but if I parse the expression 1 + 11, it will succeed in parsing it.

[1.7] parsed: Operation(+,Value(1.0),Value(11.0))

and I can parse something through the plus , multi , divide , num , sub combinator , but the expr combinator only can parse the first item of the or combinator . so why it only can parse the first item of the expr parser? And how can I change the definition of the parsers to make the parse successful ?

2

2 Answers

1
votes

The problem is that you're using rep which matches zero or more times.

def rep[T](p: => Parser[T]): Parser[List[T]] = rep1(p) | success(List())

you need to use rep1 instead which would require at least one match.

If you replace all rep with rep1, your code works.

Check out the changes on scastie

0
votes

Run an experiment:

println(p.parseAll(p.expr, "1 + 11"))
println(p.parseAll(p.expr, "1 - 11"))
println(p.parseAll(p.expr, "1 * 11"))
println(p.parseAll(p.expr, "1 / 11"))

What will happen?

[1.7] parsed: Operation(+,Value(1.0),Value(11.0))
[1.3] failure: end of input expected
1 - 11
  ^
[1.3] failure: end of input expected
1 * 11
  ^
[1.3] failure: end of input expected
1 / 11

+ is consumed, but everything else fails. Let's change def expr definition

  def expr =
    multi | plus | sub | divide | num
[1.3] failure: end of input expected
1 + 11
  ^
[1.3] failure: end of input expected
1 - 11
  ^
[1.7] parsed: Operation(*,Value(1.0),Value(11.0))
[1.3] failure: end of input expected
1 / 11
  ^

By moving multi to the beginning, * case passed, but + failed.

  def expr =
    num | multi | plus | sub | divide
[1.3] failure: end of input expected
1 + 11
  ^
[1.3] failure: end of input expected
1 - 11
  ^
[1.3] failure: end of input expected
1 * 11
  ^
[1.3] failure: end of input expected
1 / 11

With num as the first case everything fails. It is apparent now that this code

num | multi | plus | sub | divide

is NOT matching if any of its parts match, but only if the first one matches.

What does docs says about it?

   /** A parser combinator for alternative composition.
     *
     *  `p | q` succeeds if `p` succeeds or `q` succeeds.
     *   Note that `q` is only tried if `p`s failure is non-fatal (i.e., back-tracking is allowed).
     *
     * @param q a parser that will be executed if `p` (this parser) fails (and allows back-tracking)
     * @return a `Parser` that returns the result of the first parser to succeed (out of `p` and `q`)
     *         The resulting parser succeeds if (and only if)
     *         - `p` succeeds, ''or''
     *         - if `p` fails allowing back-tracking and `q` succeeds.
     */
  def | [U >: T](q: => Parser[U]): Parser[U] = append(q).named("|")

Important note: back tracking has to be allowed. If it isn't, then failure to match the first parser, will results in failing the alternative without trying the second parser at all.

How to make your parser backtracking? Well, you would have to use PackratParsers as this is the only parser in the library that supports backtracking. Or rewrite your code to not rely on backtracking in the first place.

Personally, I recommend not using Scala Parser Combinators and instead use a library where you explicitly decide when you can still backtrack, and when you should not allow it, like e.g. fastparse.