6
votes

I'm trying to understand Alex and lexers in general, but I'm having trouble running my lexer.

I have written lexers with the "basic" and "posn" wrappers, but I couldn't get one working with the "monad" wrapper. I think I have to use the monad wrapper because I need to collect strings and token positions in the input. I also need multiple states. For now I'm trying to run this simple example:

{
module Main (main) where
}

%wrapper "monad"

$whitespace = [\ \b\t\n\f\v\r]
$digit      = 0-9
$alpha      = [a-zA-Z_]
$upper      = [A-Z]
$lower      = [a-z]

@tidentifier = $upper($alpha|_|$digit)*
@identifier  = $lower($alpha|_|$digit)*


tokens :-

$whitespace+ ;
$upper $alpha+ { typeId }
$lower $alpha+ { id_ }
$digit+ { int }

{

data Lexeme = L AlexPosn LexemeClass String

data LexemeClass
        = TypeId String
        | Id String
        | Int Int
        | EOF
    deriving (Show, Eq)

typeId :: AlexInput -> Int -> Alex Lexeme
typeId = undefined

id_ :: AlexInput -> Int -> Alex Lexeme
id_ = undefined

int :: AlexInput -> Int -> Alex Lexeme
int = undefined

alexEOF = return (L undefined EOF "")

main :: IO ()
main = do
    s <- getContents
    let r = runAlex s $ do
                return alexMonadScan
    print r
}

My actions are undefined for now. When I try to compile it, I'm getting this error:

➜  haskell  ghc --make Tokens.hs
[1 of 1] Compiling Main             ( Tokens.hs, Tokens.o )

templates/wrappers.hs:208:17:
    Couldn't match expected type `(AlexPosn, Char, [Byte], String)'
                with actual type `(t0, t1, t2)'
    Expected type: AlexInput
      Actual type: (t0, t1, t2)
    In the return type of a call of `ignorePendingBytes'
    In the first argument of `action', namely
      `(ignorePendingBytes inp)'

I also get various errors when I try to compile the examples in Alex's GitHub repo. Could this be related to a version mismatch? I installed Alex from cabal with GHC 7.0.4. Any ideas?

1 Answer

7
votes

This looks like a bug in Alex 3.0.1. Your code works fine with version 2.3.3 once some other, unrelated issues are dealt with [1]. The problem is this line in the generated code:

ignorePendingBytes (p,c,ps,s) = (p,c,s)

By following the types in the generated code, it seems like this function should have the type AlexInput -> AlexInput, but AlexInput obviously can't be both a 3-tuple and a 4-tuple.

This likely occurred because the definition of AlexInput was changed between the two versions.

type AlexInput = (AlexPosn, Char, String)         -- v2.3.3
type AlexInput = (AlexPosn, Char, [Byte], String) -- v3.0.1

From what I can tell, the correct code should be

ignorePendingBytes (p,c,ps,s) = (p,c,[],s)

and manually making this change in the generated code makes it compile after dealing with the other issues.
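To see the fix in isolation, here is a minimal, self-contained sketch assuming the v3.0.1 shape of AlexInput quoted above. The AlexPosn and Byte definitions are stand-ins written for this example rather than copied from the generated code, and the sample input tuple is made up.

```haskell
module Main (main) where

import Data.Word (Word8)

-- Stand-ins for the types used by the generated code (assumptions for this sketch).
type Byte = Word8

data AlexPosn = AlexPn !Int !Int !Int
  deriving (Eq, Show)

-- The 4-tuple form of AlexInput quoted above for v3.0.1.
type AlexInput = (AlexPosn, Char, [Byte], String)

-- Corrected version: keep the 4-tuple shape and drop any pending bytes.
-- The buggy line returned (p, c, s), a 3-tuple, which cannot be an AlexInput.
ignorePendingBytes :: AlexInput -> AlexInput
ignorePendingBytes (p, c, _pendingBytes, s) = (p, c, [], s)

main :: IO ()
main = print (ignorePendingBytes (AlexPn 0 1 1, '\n', [0x61], "abc"))
```

With the explicit type signature in place, the original 3-tuple body would be rejected by the type checker at the definition itself instead of at a distant call site.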

However, unless you need something from 3.0.1, I suggest downgrading until this is fixed, as having to maintain patches against generated code is usually more trouble than it's worth.

[1] Your code is missing a Show instance for Lexeme, and you're also calling return on alexMonadScan, which is already in the Alex monad.
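The two fixes from the footnote can be sketched as follows. Alex-generated code is not available outside a real lexer build, so Alex, runAlex, AlexPosn, and alexMonadScan below are simplified stand-ins written only for this illustration (the stand-in scanner always reports EOF), and the input string is fixed instead of being read from stdin. The parts that carry over to the real file are the deriving clause on Lexeme and the way runAlex is called.

```haskell
module Main (main) where

import Control.Monad (ap, liftM)

-- Stand-in for the generated position type (assumption for this sketch).
data AlexPosn = AlexPn !Int !Int !Int
  deriving (Eq, Show)

data LexemeClass
  = TypeId String
  | Id String
  | Int Int
  | EOF
  deriving (Show, Eq)

-- Fix 1: Lexeme derives Show, so the result of runAlex can be printed.
data Lexeme = L AlexPosn LexemeClass String
  deriving (Show, Eq)

-- Stand-in for the Alex monad: remaining input plus failure via Either.
-- This is a toy model, not the definition from the generated wrapper.
newtype Alex a = Alex { unAlex :: String -> Either String (String, a) }

instance Functor Alex where
  fmap = liftM

instance Applicative Alex where
  pure a = Alex $ \s -> Right (s, a)
  (<*>) = ap

instance Monad Alex where
  Alex g >>= k = Alex $ \s -> case g s of
    Left err      -> Left err
    Right (s', a) -> unAlex (k a) s'

runAlex :: String -> Alex a -> Either String a
runAlex input (Alex f) = fmap snd (f input)

-- Stand-in scanner: ignores the input and returns an EOF lexeme.
-- A concrete position is used here because printing `undefined` would fail at run time.
alexMonadScan :: Alex Lexeme
alexMonadScan = pure (L (AlexPn 0 1 1) EOF "")

main :: IO ()
main = do
  let s = "Foo bar 42"
  -- Fix 2: pass alexMonadScan directly. `runAlex s (return alexMonadScan)`
  -- would have type Either String (Alex Lexeme), which has no Show instance.
  let r = runAlex s alexMonadScan
  print r
```

In the real lexer the same two changes apply to the trailing code block of the .x file: add the deriving clause to Lexeme and drop the return in front of alexMonadScan.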