0
votes

I want to write a new parser for mathematical program. When i write BNF grammar for that, I'm stuck in expressions. And this is the BNF notation I wrote.

program : expression*

expression: additive_expression

additive_expression : multiplicative_expression
                    | additive_expression '+' multiplicative_expression
                    | additive_expression '-' multiplicative_expression

multiplicative_expression : number
                          | number '*' multiplicative_expression
                          | number '/' multiplicative_expression

But i cannot understand how to write BNF expression grammar for ++, --, +=, -=, &&, || etc. this operators. I mean operators in languages like C, C++, C#, Python, Java etc.

I know that when using +, -, *, / operators, grammar should be written according to the BODMAS theory.

What i want to know is whether any theory should be used in writing grammar for other operators?

I took a good look at the C++ Language grammar ( expressions part ). But I can not understand it.

<expression> ::= <assignment_expression>
               | <assignment_expression> <expression>

<assignment_expression> ::= <logical_or_expression> "=" <assignment_expression>
                          | <logical_or_expression> "+=" <assignment_expression>
                          | <logical_or_expression> "-=" <assignment_expression>
                          | <logical_or_expression> "*=" <assignment_expression>
                          | <logical_or_expression> "/=" <assignment_expression>
                          | <logical_or_expression> "%=" <assignment_expression>
                          | <logical_or_expression> "<<=" <assignment_expression>
                          | <logical_or_expression> ">>=" <assignment_expression>
                          | <logical_or_expression> "&=" <assignment_expression>
                          | <logical_or_expression> "|=" <assignment_expression>
                          | <logical_or_expression> "^=" <assignment_expression>
                          | <logical_or_expression>

<constant_expression> ::= <conditional_expression>

<conditional_expression> ::= <logical_or_expression>

<logical_or_expression> ::= <logical_or_expression> "||" <logical_and_expression>
                          | <logical_and_expression>

<logical_and_expression> ::= <logical_and_expression> "&&" <inclusive_or_expression>
                           | <inclusive_or_expression>

<inclusive_or_expression> ::= <inclusive_or_expression> "|" <exclusive_or_expression>
                            | <exclusive_or_expression>

<exclusive_or_expression> ::= <exclusive_or_expression> "^" <and_expression>
                            | <and_expression>

<and_expression> ::= <and_expression> "&" <equality_expression>
                   | <equality_expression>

<equality_expression> ::= <equality_expression> "==" <relational_expression>
                        | <equality_expression> "!=" <relational_expression>
                        | <relational_expression>

<relational_expression> ::= <relational_expression> ">" <shift_expression>
                          | <relational_expression> "<" <shift_expression>
                          | <relational_expression> ">=" <shift_expression>
                          | <relational_expression> "<=" <shift_expression>
                          | <shift_expression>

<shift_expression> ::= <shift_expression> ">>" <addictive_expression>
                     | <shift_expression> "<<" <addictive_expression>
                     | <addictive_expression>

<addictive_expression> ::= <addictive_expression> "+" <multiplicative_expression>
                         | <addictive_expression> "-" <multiplicative_expression>
                         | <multiplicative_expression>

<multiplicative_expression> ::= <multiplicative_expression> "*" <unary_expression>
                              | <multiplicative_expression> "/" <unary_expression>
                              | <multiplicative_expression> "%" <unary_expression>
                              | <unary_expression>

<unary_expression> ::= "++" <unary_expression>
                     | "--" <unary_expression>
                     | "+" <unary_expression>
                     | "-" <unary_expression>
                     | "!" <unary_expression>
                     | "~" <unary_expression>
                     | "size" <unary_expression>
                     | <postfix_expression>

<postfix_expression> ::= <postfix_expression> "++"
                       | <postfix_expression> "--"
                       | <primary_expression>

<primary_expression> ::= <integer_literal>
                       | <floating_literal>
                       | <character_literal>
                       | <string_literal>
                       | <boolean_literal>
                       | "(" <expression> ")"
                       | IDENTIFIER

<integer_literal> ::= INTEGER

<floating_literal> ::= FLOAT

<character_literal> ::= CHARACTER

<string_literal> ::= STRING

<boolean_literal> ::= "true"
                    | "false"

Can you help me understand this? I searched a lot on the internet about this. But I could not find the right solution for this.

Thank you

1
What do you find hard to understand in the C grammar?rici
@rici Why that grammar is in the same order. Most of the languages I looked at had the same order. Is there any reason for that? :)Lasan Nishshanka
the order is based on someone's idea of what programmers expect, or what they will find useful once they get used to it. There are no rules; operators don't have intrinsic precedence. But there are some common expectations. You probably expect that if (a > b + 7) intends to compare a with b + 7, rather than adding 7 to the boolean result of comparing a with b. So some of these decisions have a basis in common use and others are more or less arbitrary.rici
@rici Than you very muchLasan Nishshanka

1 Answers

0
votes

I gather that the problem is not that you don't know how to write BNF grammars. Rather, you don't know which grammar you should write. In other words, if someone told you the precedence order for your operators, you would have no trouble writing down a grammar which parsed the operators with that particular precedence order. But you don't know which precedence order you should use.

Unfortunately there is not an International High Commission on Operator Precedence, and neither does any religion that I know of offer a spiritual reference including divinely inspired operator precedence rules.

For some operators, the precedence order is reasonably clear: BODMAS was adopted for good reasons, for example, but the main reason is that most people already wrote arithmetic according to those rules. You can certainly make a plausible mathematical argument based on group theory as to why it seems natural to give multiplication precedence over addition, but there will always be the suspicion that the argument was produced post facto to justify an already-made decision. Be that as it may, the argument for making multiplication bind more tightly than addition also works for giving bitwise-and precedence over bitwise-or, or boolean-and precedence over boolean-or. (Although I note that C compiler writers don't trust that programmers will have that particular intuition, since most modern compilers issue a warning if you omit redundant parentheses in such expressions.)

One precedence relationship which I think just about everyone would agree was intuitive is giving arithmetic operators precedence over comparison operators, and comparison operators precedence operators precedence over boolean operators. Most programmers would find a language which chose to interpret a > b + 7 as meaning "add seven to the boolean value of comparing a with b". Similarly, it might be considered outrageous to interpret a > 0 || b > 0 in any way other than (a > 0) || (b > 0). But other precedence choices seem a lot more arbitrary, and not all languages make them the same. (For example, the relative precedence of "not equal to" and "greater than", or of "exclusive or" and "inclusive or".

So what's a novice language designer to do? Well, first appeal to your own intuitions, particularly if you have used the operators in question a lot. Also, ask your friends, contacts, and potential language users what they think. Look at what other languages have done, but look with a critical eye. Search for complaints on language-specific forums. It may well be that the original language designer made an unfortunate (or even stupid) choice, and that choice now cannot be changed because it would break too many existing programs. (Which is why it is a good thing that you are worrying about operator precedence: getting it wrong can bring serious future problems.)

Direct experimentation can help, too, particularly with operators which are rarely used together. Define a plausible expression using the operators, and then write it out two (or more) times leaving out parentheses according to the various possible rules. Which of those looks more understandable? (Again, recruit your friends and ask them the same question.)

If, in the end, you really cannot decide what the precedence order between a particular pair of operators is, you can consider one more solution: Make parentheses mandatory for these operators. (Which is the intent of the C compiler whining about leaving out redundant parentheses in boolean expressions.)