
I'd like to implement Markdown as an Xtext DSL, but it seems tricky to parse headings like # Introduction because they have no definite end symbol. Is there any way to express this? Or is it generally not possible (that is, not just a limitation of Xtext)?

Here is my Xtext grammar:

grammar markdown.Markdown with org.eclipse.xtext.common.Terminals

generate markdown "http://www.Markdown.markdown"

Model:
  entities+=Entity*;

Entity:
  Section | Subsection | Paragraph
;

Section:
  '#'
    content+=TextPart
  '::'
;

Subsection:
  '##'
    content+=TextPart
  '::'
;

Paragraph:
  content+=TextPart
;

TextPart:
  text=Text
;

Text:
        (ID | WS | SINGLE_NL | MULTI_NL | ANY_OTHER | '\\[' | '\\]' | ',' | "-" | '\\:' | '\\%' | '\\#' | '\\##' )+;


terminal ID:
        ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | '0'..'9')*;

terminal SL_COMMENT:
        '%%' !('\n' | '\r')* ('\r'? '\n');

terminal MULTI_NL:
        '\r'? '\n' (/*(' ' | '\t')**/ '\r'? '\n')+;

terminal SINGLE_NL:
        '\r'? '\n';

terminal WS:
        ' ' | '\t';

terminal ANY_OTHER:
        .;

The terminals are taken from Xdoc. With these grammar rules, the following input is possible:

# Introduction ::

Lorem ipsum.

## Other chapter ::

Lorem ipsum.

But I'd like to write Markdown like this:

# Introduction

Lorem ipsum.

## Other chapter

Lorem ipsum.

So we need \n instead of :: as the terminator. Is this possible? Furthermore, ANTLR produces warnings caused by the terminal rules, but these warnings don't occur when building Xdoc.xtext. What am I doing wrong?

warning(200): ../markdown/src-gen/markdown/parser/antlr/internal/InternalMarkdown.g:436:1: Decision can match input such as "'-'" using multiple alternatives: 9, 14
As a result, alternative(s) 14 were disabled for that input
warning(200): ../markdown/src-gen/markdown/parser/antlr/internal/InternalMarkdown.g:436:1: Decision can match input such as "'\\['" using multiple alternatives: 6, 14
As a result, alternative(s) 14 were disabled for that input
...

1 Answer


The single-line comment terminal defined in the Xtext base grammar (org.eclipse.xtext.common.Terminals) is very similar to what you need:

terminal SL_COMMENT: '//' !('\n'|'\r')* ('\r'? '\n')?;

The basic idea is the same: the token may contain any characters except end-of-line characters, so the line break itself acts as the end symbol.
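
Applied to headings, the same pattern might look like the sketch below. It is untested and rests on some assumptions: the terminal names H1_LINE and H2_LINE and the title feature are made up for illustration, a heading marker is assumed to be followed by a single space, and both terminals are assumed to be declared before ANY_OTHER so that they take precedence.

```
Section:
  title=H1_LINE;

Subsection:
  title=H2_LINE;

// Hypothetical terminals: the marker, one space, then everything up to
// (but not including) the line break. The line break is left for
// SINGLE_NL / MULTI_NL to consume.
terminal H2_LINE:
  '## ' !('\n' | '\r')*;

terminal H1_LINE:
  '# ' !('\n' | '\r')*;
```

Two caveats apply. The token text still contains the leading marker, so a value converter would be needed to strip it. The lexer is also not context-sensitive, so a "# " in the middle of a paragraph line would be tokenized as a heading as well unless it is escaped or handled separately.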