
I am new to ANTLR4. I am using ANTLR4 with the antlr4 adaptor to parse C files and generate a PSI tree.

I know that a C preprocessor would normally handle the #include and #define directives and pass the result to the C lexer and parser. But I need C.g4 itself to parse #include and #define, so that my plugin can handle C files without a preprocessor.

I looked into this link and tried the solution, but as soon as the input contains anything other than preprocessor statements, the rest of the file is not resolved into a tree.

Sample C code:

#include <stdio.h>

int main()
{
  int i,j;
  for(i=1;i<=9;)
     {
      for(j=1;j<=9;j)
         {
           if(i>=j)
            {
              printf("%d*%d=%d    ",j,i,j*i);
            }
            j++;
         }
          printf("\n");
          i++;
     }
    return 0;
}

The resulting tree looks like this:

enter image description here

As the image shows, everything after the include block appears as a flat list of elements rather than a tree.

When I delete the #include line from the sample C code and use the original C.g4 grammar, I get a proper PSI tree, like this:

enter image description here

Could anyone help me improve the grammar below, so that it resolves #include and #define as preprocessor blocks in the PSI tree without using any preprocessor?

Thanks

Whitespace
:   [ \t]+
    -> channel(HIDDEN)
;
Newline
:   (   '\r' '\n'?
    |   '\n'
    )
    -> channel(HIDDEN)
;

BlockComment
:   '/*' .*? '*/'
;

LineComment
:   '//' ~[\r\n]*
;


IncludeBlock
:   '#' Whitespace? 'include' ~[\r\n]*
;

DefineStart
:   '#' Whitespace? 'define'
;

DefineBlock
:   DefineStart (~[\\\r\n] | '\\\\' '\r'? '\n' | '\\'. )*
;

MultiDefine
:   DefineStart MultiDefineBody
;

MultiDefineBody
:   [\\] [\r\n]+ MultiDefineBody
|   ~[\r\n]
;

preprocessorDeclaration
:   includeDeclaration
|   defineDeclaration
;

includeDeclaration
:   IncludeBlock
;

defineDeclaration
:   DefineBlock | MultiDefine
;

comment
:   BlockComment
|   LineComment
;

declaration
:   declarationSpecifiers initDeclaratorList ';'
|   declarationSpecifiers ';'
|   staticAssertDeclaration
|   preprocessorDeclaration
|   comment
;

1 Answer


I tried and solved my own problem.

The trick is that newlines should be skipped, as in the grammar below, rather than sent to channel(HIDDEN). The original rule

Newline
:   (   '\r' '\n'?
    |   '\n'
    )
    -> channel(HIDDEN)
;

should be changed to

Newline
:   (   '\r' '\n'?
    |   '\n'
    )
    -> skip
;

With that change, the PSI tree comes out correctly:

good PSI tree with #include and #define

Anyway, I do not fully understand the difference between channel(HIDDEN) and skip.
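
For what it is worth, here is a minimal sketch of how the two lexer commands differ in plain ANTLR4. With -> skip the lexer discards the matched text and emits no token at all. With -> channel(HIDDEN) a token is still emitted and stays in the token stream, but off the default channel, so parser rules never see it. The grammar name and the Spaces and Identifier rules are hypothetical and only there to make the fragment self-contained; they are not taken from C.g4.

```antlr
lexer grammar ChannelVsSkipSketch;   // hypothetical grammar name

// skip: the matched text is thrown away; no token is created for it.
Spaces
    :   [ \t\r\n]+ -> skip
    ;

// channel(HIDDEN): a LineComment token is created and kept in the token
// stream, but on the hidden channel, which the parser does not read.
LineComment
    :   '//' ~[\r\n]* -> channel(HIDDEN)
    ;

// An ordinary token on the default channel, visible to the parser.
Identifier
    :   [a-zA-Z_] [a-zA-Z_0-9]*
    ;
```

A plain ANTLR4 parser sees neither kind of input, so the practical difference shows up only in tools that walk the full token stream: hidden tokens can still be retrieved later (for example through BufferedTokenStream.getHiddenTokensToLeft), while skipped text is gone. That may be why the choice affects the PSI tree built through the adaptor, though I have not verified how the adaptor treats off-channel tokens.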