0
votes

Hi, I'm trying to understand the best approach to lexical analysis. I did some research, but I'm a bit confused. Please correct me if I'm wrong. For lexical analysis there are basically two ways:

  1. using context free grammar
  2. using regular expressions

And it says:

  RE -> lexer generator (ML-Lex) -> lexer
  CFG -> parser generator (ML-Yacc) -> parser

But why don't they use the term "lexer generator" for the CFG case? We still have to generate tokens, right? From the CFG we have to generate tokens and pass them to the parser, right? Please correct me if I'm wrong. They also say that using a CFG is better, because any language that can be generated using an RE can also be generated using a CFG. Yet most programming languages use REs for lexical analysis, and I couldn't find a reason for that either.


1 Answer

1
votes

A context-free grammar that describes the tokens in a language is usually longer, and harder to write, than the set of regular expressions that describe the same tokens.

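Context-free grammars are strictly more powerful than regular expressions, in the sense that they can describe a larger class of languages (balanced nested parentheses, for example, which no regular expression can match). Tokens rarely need that extra power, so as long as regular expressions are enough, using them is usually easier.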
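
To make the difference in effort concrete, here is a rough sketch in Python (not ML-Lex, and not how any particular lexer generator works internally). The token kinds `INT`, `ID` and `WS`, and the function names `tokenize` and `match_int_cfg`, are made up for this illustration. The first part describes the tokens with regular expressions; the second part recognizes just the integer token by following grammar productions directly.

```python
import re

# Regular-expression description: one alternative per token kind.
# The token kinds (INT, ID, WS) are hypothetical, chosen for illustration.
TOKEN_RE = re.compile(
    r"(?P<INT>[0-9]+)|(?P<ID>[A-Za-z_][A-Za-z0-9_]*)|(?P<WS>\s+)"
)


def tokenize(text):
    """Split text into (kind, lexeme) pairs, skipping whitespace."""
    pos = 0
    tokens = []
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


# Grammar-style description of the integer token alone:
#   Int   -> Digit Int | Digit
#   Digit -> '0' | '1' | ... | '9'
def match_int_cfg(text, pos):
    """Return the position just past an Int starting at pos, or None."""
    if pos < len(text) and text[pos] in "0123456789":  # Digit
        rest = match_int_cfg(text, pos + 1)            # try Digit Int
        return rest if rest is not None else pos + 1   # else just Digit
    return None
```

The regular expression `[0-9]+` says in one short pattern what the grammar needs two productions (and a recursive function) to say, and the gap grows with identifiers, string literals and comments. That is the practical reason lexers are normally specified with regular expressions even though a grammar could describe the same tokens.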