Haskell Parsec: Touching Operators

Question

I have a logical language defined by the following BNF.

 formula ::= true
           | false
           | var
           | formula & formula
           | [binder] formula

 binder  ::= var 
           | $var

Essentially, this allows for formulae such as x & true, [x]x and [$x](x & true). The semantics are not important here; but the essential thing is that I have these square bracketed expressions appearing before formulae, and inside those square bracketed expressions, identifiers may or may not be preceded by a dollar sign ($). Now I have used Haskell's Parsec library to help me construct a parser for this language, detailed below.

module LogicParser where

import System.IO
import Control.Monad
import Text.ParserCombinators.Parsec
import Text.ParserCombinators.Parsec.Expr
import Text.ParserCombinators.Parsec.Language
import qualified Text.ParserCombinators.Parsec.Token as Token

-- Data Structures
data Formula = LVar String
             | TT
             | FF
             | And Formula Formula
             | Bound Binder Formula
             deriving Show

data Binder = BVar String
             | FVar String
             deriving Show

-- Language Definition
lang :: LanguageDef st
lang =
    emptyDef{ Token.identStart      = letter
            , Token.identLetter     = alphaNum
            , Token.reservedOpNames = ["&", "$", "[", "]"]
            , Token.reservedNames   = ["tt", "ff"]
            }


-- Lexer for langauge
lexer = 
    Token.makeTokenParser lang


-- Trivial Parsers
identifier     = Token.identifier lexer
keyword        = Token.reserved lexer
op             = Token.reservedOp lexer
roundBrackets  = Token.parens lexer
whiteSpace     = Token.whiteSpace lexer

-- Main Parser, takes care of trailing whitespaces
formulaParser :: Parser Formula
formulaParser = whiteSpace >> formula

-- Parsing Formulas
formula :: Parser Formula
formula = andFormula
        <|> formulaTerm

-- Term in a Formula
formulaTerm :: Parser Formula
formulaTerm = roundBrackets formula
            <|> ttFormula
            <|> ffFormula
            <|> lvarFormula
            <|> boundFormula

-- Conjunction
andFormula :: Parser Formula
andFormula = 
    buildExpressionParser [[Infix (op "&" >> return And) AssocLeft]] formulaTerm

-- Bound Formula
boundFormula :: Parser Formula
boundFormula = 
    do  op "["
        v <- var
        op "]"
        f <- formulaTerm
        return $ Bound v f

-- Truth
ttFormula :: Parser Formula
ttFormula = keyword "tt" >> return TT

-- Falsehood
ffFormula :: Parser Formula
ffFormula = keyword "ff" >> return FF

-- Logical Variable
lvarFormula :: Parser Formula
lvarFormula =
    do  v <- identifier
        return $ LVar v

-- Variable
var :: Parser Binder
var = try bvar <|> fvar

-- Bound Variable
bvar :: Parser Binder
bvar =
    do  op "$"
        v <- identifier
        return $ BVar v

-- Free Variable
fvar :: Parser Binder
fvar =
    do  v <- identifier
        return $ FVar v


-- For testing
main :: IO ()
main = interact (unlines . (map stringParser) . lines)

stringParser :: String -> String
stringParser s = 
    case ret of
        Left e ->  "Error: " ++ (show e)
        Right n -> "Interpreted as: " ++ (show n)
    where 
        ret = parse formulaParser "" s

My issue is the following. When the dollar sign operator ($) touches the square bracket, I get an error, whereas if I add a space, the parser works fine:

How can I get the parser to recognise [$x](x & true)? Note that it has no issue with & touching its operands, only when the two operators [ and $ touch.

amalloy amalloy · Accepted Answer · 2019-03-19T15:45:27

I am not very familiar with the token side of Parsec, but from its documentation I think you need to provide opLetter and possibly opStart, since you are providing reservedOp:

opLetter :: ParsecT s u m Char    
This parser should accept any legal tail characters of operators. Note that this parser should even be defined if the language doesn't support user-defined operators, or otherwise the reservedOp parser won't work correctly.

In particular the default opLetter doesn't include [ or ], so it is behaving erratically when you think one of those should be an op.

Haskell Parsec: Touching Operators

3 Answers