I am trying to build a scanner for AWK source code using (F)Lex. I have been able to identify AWK keywords, comments, string literals, and digits; however, I am stuck on how to write a regular expression that matches variable names, since these are chosen by the programmer rather than drawn from a fixed list.
Could someone please help me develop a regular expression for matching AWK variables? http://pubs.opengroup.org/onlinepubs/009695399/utilities/awk.html provides the definition of the AWK language.
Variables must start with a letter (upper or lower case) and may otherwise contain letters and digits. The only special character that can be used is an underscore ("_"). I apologize; I am not very experienced with regular expressions in general, let alone regular expressions for Flex.
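To make the rule concrete, here is a minimal sketch of how I understand it, written as a small C check using POSIX `regex.h` so the pattern can be tried outside the scanner. The function name `is_awk_name` is made up for illustration. In a Flex rules section the same pattern would presumably be written without the anchors, as `[A-Za-z][A-Za-z0-9_]*`, and placed after the keyword rules, because keywords match it too. Note that this follows the rule as I stated it; the POSIX definition also appears to allow a leading underscore, in which case the first bracket would become `[A-Za-z_]`.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not part of Flex or AWK): returns 1 if s is a
 * variable name under the rule described above, 0 otherwise.
 * Rule assumed: one letter, then any number of letters, digits, or '_'. */
int is_awk_name(const char *s)
{
    regex_t re;
    int ok;

    /* ^ and $ anchor the match to the whole string; a Flex rule would
     * not need them because Flex matches tokens from the current position. */
    if (regcomp(&re, "^[A-Za-z][A-Za-z0-9_]*$", REG_EXTENDED | REG_NOSUB) != 0)
        return 0;   /* treat a compile failure as "no match" */

    ok = (regexec(&re, s, 0, NULL, 0) == 0);
    regfree(&re);
    return ok;
}
```

With this sketch, names such as `count` and `my_var2` are accepted, while `2abc` and `a-b` are rejected.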
Thank you for your help.
See the compat awk script in the original 'The Awk Programming Language'. It lists all the keywords and functions in the (old) awk specification. It is intended to help those migrating to (new) awk find name collisions in old code, but it also offers a good general approach to examining awk code, along with a list of many of the functions and keywords. You can easily extend it with the new keywords (just as a cross-check). Good luck. – shellter