2
votes

I have written a flex program that detects whether a given word is a verb. The input is taken from a text file. I want to improve the code: is there a way to detect single-line or multi-line strings in the input file (say, "I am a boy" or "I am a boy \ I love football")? In such cases I want the program to print "single/multi line string is found". How can I do this? Please help. This is my sample code:

%%

[ \t]+    ;   /* skip blanks and tabs */

is   |
am   |
are  |
was  |
were      { printf("%s: is a verb\n", yytext); }

[a-zA-Z]+ { printf("%s: is not a verb\n", yytext); }

.|\n      ;   /* ignore everything else */

%%

int main(int argc, char *argv[]){
    yyin = fopen(argv[1], "r");
    yylex();
    fclose(yyin);
}

2 Answers

2
votes

It's quite easy to add a single rule to your lexer to recognise strings (that can be spread over several lines):

%%
["][^"]*["] {printf("'%s': is a string\n", yytext); }
[a-zA-Z]+ {printf("%s: is a word\n",yytext); }
[ \t\n]+
.
%%
int main(int argc, char *argv[]){    
    yyin = fopen(argv[1], "r");    
    yylex();         
    fclose(yyin);
}

(I tidied it up a bit to focus on the string vs no-string demonstration.)
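
To get the "single/multi line string is found" message from the question, one option is to check the matched text for a newline. The following is only a minimal sketch in C: `string_kind` is a hypothetical helper name (not part of flex) that would live in the definitions or user-code section of the lexer file, and it is assumed to receive `yytext` from the string rule above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of flex): classify a matched string
 * literal by whether its text contains a newline character.
 * `text` is assumed to be yytext from the string rule. */
const char *string_kind(const char *text)
{
    assert(text != NULL);
    return strchr(text, '\n') ? "multi line" : "single line";
}
```

Assuming that helper, the action of the string rule could become `printf("%s string is found\n", string_kind(yytext));`. Note that the simple pattern used here does not handle escaped quotes inside a string.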

0
votes

Flex generates a scanner, and a scanner is generally intended to identify individual tokens, in this case words or newlines. It reads only as many characters from the input as it needs to determine what the current token is, and looks no further ahead than that. If you want to do something when a newline appears somewhere in the input as one of a sequence of tokens, that is better handled by a parser, for example one generated by Yacc or Bison.
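
As a rough illustration of that division of labour, here is a sketch in plain C rather than flex or Bison. Everything in it is hypothetical: `next_token` merely stands in for a flex-generated `yylex()`, the `TOK_*` codes are made-up names, and `count_lines_with_words` plays the role a parser would have, deciding what a newline means in the context of the tokens around it.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token codes; a flex scanner would return such values
 * from its rule actions. */
enum token { TOK_END = 0, TOK_WORD, TOK_NEWLINE, TOK_OTHER };

/* Stand-in for yylex(): reads one token starting at *cursor and
 * advances *cursor past it. It knows nothing about earlier tokens. */
enum token next_token(const char **cursor)
{
    const char *p = *cursor;
    enum token tok;

    while (*p == ' ' || *p == '\t')   /* skip blanks, but not newlines */
        p++;

    if (*p == '\0') {
        tok = TOK_END;
    } else if (*p == '\n') {
        tok = TOK_NEWLINE;
        p++;
    } else if (isalpha((unsigned char)*p)) {
        tok = TOK_WORD;
        while (isalpha((unsigned char)*p))
            p++;
    } else {
        tok = TOK_OTHER;
        p++;
    }

    *cursor = p;
    return tok;
}

/* The caller sees the whole token sequence, so it can give newlines a
 * meaning: here it counts the lines that contain at least one word. */
int count_lines_with_words(const char *input)
{
    int lines = 0;
    int saw_word = 0;
    enum token tok;

    while ((tok = next_token(&input)) != TOK_END) {
        if (tok == TOK_WORD) {
            saw_word = 1;
        } else if (tok == TOK_NEWLINE) {
            lines += saw_word;
            saw_word = 0;
        }
    }
    return lines + saw_word;   /* include a last line with no trailing newline */
}
```

With a real grammar, the loop in `count_lines_with_words` would be replaced by parser rules, while the scanner would stay as simple as `next_token`.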