Paragraph breaks will be associated with the deepest child that will accept them. These paragraph breaks can be moved around in a post-processing pass over the tree (which must happen anyway to parse inline syntax). This also allows for child-type-dependent processing of paragraph breaks.
Market
Market is an opinionated dialect of Markdown written in portable Scheme. It is designed to be easy to extend and understand.
It is not a CommonMark parser. For a Scheme CommonMark parser, see Guile CommonMark.
Block Syntax
Market uses significant indentation to make documents more regular and to make parsing easier.
Market parses line by line, similar to CommonMark. It also parses blocks before inline syntax. To parse blocks:
- The beginning of each line is checked for the correct indentation
  recursively from the top of the document tree. (Unlike other Markdowns,
  lazy lines are not allowed.) Indentation is usually spaces, but can
  include `>`, `|`, etc.
- The rest of the line is checked for a block start.
- If there is no block start, then it is a paragraph line.
In practice, each block has a `continues` procedure that checks if the
current fragment of the line continues the block. If it does, then the
prefix of the line is consumed and the process continues with the active
child block of the block. After going as far as possible, the line is then
checked with a `starts` procedure (stored in each block) to see if any
new block can be started.
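The descent described above can be sketched as follows. This is a minimal illustration in Python, not Market's Scheme source: the `Block` class, the `add_line`, `parse`, and `shape` helpers, and the choice of `> ` as the only container prefix are all assumptions made for the example. Only the names `continues` and `starts` come from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Block:
    kind: str  # "document", "quote", or "paragraph" (hypothetical kinds)
    children: List["Block"] = field(default_factory=list)
    lines: List[str] = field(default_factory=list)  # used by paragraphs only
    is_open: bool = True


def continues(block: Block, rest: str) -> Optional[str]:
    """Return `rest` minus the block's prefix, or None if the block ends."""
    if block.kind == "quote" and rest.startswith("> "):
        return rest[2:]
    return None


def starts(rest: str) -> Optional[Tuple[Block, str]]:
    """Return a new block and the remaining text if `rest` opens a block."""
    if rest.startswith("> "):
        return Block("quote"), rest[2:]
    return None


def active_child(block: Block) -> Optional[Block]:
    """The last child, if it is still open."""
    if block.children and block.children[-1].is_open:
        return block.children[-1]
    return None


def add_line(root: Block, line: str) -> None:
    block, rest = root, line
    # Step 1: descend through open containers whose prefix matches.
    child = active_child(block)
    while child is not None and child.kind != "paragraph":
        consumed = continues(child, rest)
        if consumed is None:
            child.is_open = False  # no lazy lines: the container ends here
            break
        block, rest = child, consumed
        child = active_child(block)
    # Step 2: open every new block that starts at this point.
    started = starts(rest)
    while started is not None:
        previous = active_child(block)
        if previous is not None:
            previous.is_open = False  # a new block ends an open paragraph
        new_block, rest = started
        block.children.append(new_block)
        block = new_block
        started = starts(rest)
    # Step 3: whatever is left is a blank line or a paragraph line.
    paragraph = active_child(block)
    if not rest.strip():
        if paragraph is not None:
            paragraph.is_open = False  # assumption: a blank line ends a paragraph
        return
    if paragraph is None:
        paragraph = Block("paragraph")
        block.children.append(paragraph)
    paragraph.lines.append(rest)


def parse(lines: List[str]) -> Block:
    root = Block("document")
    for line in lines:
        add_line(root, line)
    return root


def shape(block: Block):
    """Nested tuples that make the resulting tree easy to inspect."""
    if block.kind == "paragraph":
        return (block.kind, block.lines)
    return (block.kind, [shape(child) for child in block.children])
```

In this sketch a line that fails a container's `continues` check closes that container immediately, which is one way to express the "no lazy lines" rule. Calling `shape(parse(lines))` on a list of input lines returns the tree as nested tuples.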
TODO: According to the continuation line rules, block continuations consisting entirely of spaces must be placed to make sure a blank line is in the correct block, even though such text is not visible. The seemingly correct thing to do is to make the block continue on a blank line, but this cannot be done for every block (then blockquotes cannot be separated by blank lines).
The continuation could check for an empty line. Currently empty lines are
handled in a special way. They could be merged with the add-new-paragraph
handler, so that continuations could check for empty lines.
Inline Syntax
(This section is not yet implemented.)
CommonMark's rules for inline syntax are very complex, and its authors acknowledge this:
By far the trickiest part of inline parsing is handling emphasis, strong emphasis, links, and images.
The inline matcher for Market keeps a list of inline syntaxes, each paired with the action to take after reading it. Upon reading an inline syntax (disambiguated using greedy match), the stored procedure is called with the rest of the line as an argument.
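A table-driven matcher of this kind might look like the sketch below. It is written in Python for illustration only, since this part of Market is not yet implemented. The names `INLINE_SYNTAX`, `action`, `longest_match`, and `scan`, and the event tuples the actions return, are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Event = Tuple[str, str]


def action(name: str) -> Callable[[str], Event]:
    """Build a hypothetical action that records its name and the rest of the line."""
    return lambda rest: (name, rest)


# Hypothetical table: inline syntax -> procedure called with the rest of the line.
INLINE_SYNTAX: Dict[str, Callable[[str], Event]] = {
    "*": action("emphasis"),
    "**": action("strong"),
    "***": action("strong-emphasis"),
}


def longest_match(table: Dict[str, Callable[[str], Event]],
                  line: str, pos: int) -> Optional[str]:
    """Greedy match: the longest table entry that occurs at `pos`, if any."""
    best = None
    for token in table:
        if line.startswith(token, pos) and (best is None or len(token) > len(best)):
            best = token
    return best


def scan(table: Dict[str, Callable[[str], Event]], line: str) -> List[Event]:
    """Split a line into text runs and the results of the stored actions."""
    events: List[Event] = []
    pos = text_start = 0
    while pos < len(line):
        token = longest_match(table, line, pos)
        if token is None:
            pos += 1
            continue
        if text_start < pos:
            events.append(("text", line[text_start:pos]))
        pos += len(token)
        events.append(table[token](line[pos:]))  # action sees the rest of the line
        text_start = pos
    if text_start < len(line):
        events.append(("text", line[text_start:]))
    return events
```

Keeping the syntax in a table means a new inline syntax is added by inserting one entry, without touching `scan`.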
The parser treats `*`, `**`, and `***` as three separate cases. Combining
this with the greedy-match rule causes the following behavior:
- Starting with `***` and mixing `**` and `*` is not allowed. Example:
  `***bold and italic** just italic*` is not allowed.

- Contrary to the original Markdown syntax, `*` and `_` can be combined.
  This allows for something like the above pattern:
  `_**bold and italic** just italic_` or
  `*emphasized text _with emphasis in it_ and out of it*`
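The two cases above can be checked with a small delimiter stack. The sketch below is in Python and rests on an assumed rule that this README does not state: a delimiter closes the most recent opener only when it is the same token, and otherwise opens a new span. The names `DELIMITER` and `emphasis_is_balanced` are hypothetical.

```python
import re

# Longest alternatives come first, so "***" is never read as "**" then "*".
DELIMITER = re.compile(r"\*\*\*|\*\*|\*|_")


def emphasis_is_balanced(line: str) -> bool:
    """True if every delimiter is closed by an identical delimiter, innermost first."""
    stack = []
    for match in DELIMITER.finditer(line):
        token = match.group()
        if stack and stack[-1] == token:
            stack.pop()          # same token as the innermost opener: close it
        else:
            stack.append(token)  # anything else opens a new span
    return not stack
```

Under this rule `***bold and italic** just italic*` leaves three openers unclosed and is rejected, while both `_**bold and italic** just italic_` and `*emphasized text _with emphasis in it_ and out of it*` nest cleanly. The sketch treats every `_` as a delimiter, so it does not handle underscores inside words.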