A dialect of Markdown that is easier to parse
Peter McGoron cbb20bcbeb Rewrite tests to conform to greedy behavior for paragraph breaks
Paragraph breaks will be associated with the deepest child that
will accept them. These paragraph breaks can be moved around in
a post-processing pass over the tree (which must happen anyways to
parse inline syntax). This also allows for child-type-dependent
processing of paragraph breaks.
2025-02-04 10:26:28 -05:00
market Rewrite tests to conform to greedy behavior for paragraph breaks 2025-02-04 10:26:28 -05:00
mcgoron/srfi block structure 2025-01-25 15:37:19 -05:00
tests Rewrite tests to conform to greedy behavior for paragraph breaks 2025-02-04 10:26:28 -05:00
COPYING block structure 2025-01-25 15:37:19 -05:00
prelude.scm remove unordered list containers 2025-01-26 18:35:17 -05:00
README.md expand starts and continues by adding current node as argument 2025-01-27 06:44:44 -05:00

Market

Market is an opinionated dialect of Markdown written in portable Scheme. It is designed to be easy to extend and understand.

It is not a CommonMark parser. For a Scheme CommonMark parser, see Guile CommonMark.

Block Syntax

Market uses significant indentation to make documents more regular and to make parsing easier.

Market parses documents line by line, similar to CommonMark, and it parses blocks before inline syntax. To parse blocks:

  1. The beginning of each line is checked for the correct indentation recursively from the top of the document tree. (Unlike other Markdowns, lazy lines are not allowed.) Indentation usually consists of spaces, but can include characters such as > and |.
  2. The rest of the line is checked for a block start.
  3. If there is no block start, then it is a paragraph line.

In practice, each block has a continues procedure that checks whether the current fragment of the line continues the block. If it does, the prefix of the line is consumed and the process repeats with the block's active child. After descending as far as possible, the rest of the line is checked with a starts procedure (stored in each block) to see whether a new block can be started.
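The following is a minimal Python sketch of that descent, not the project's Scheme code. The `Block` class, its `kind` strings, the `add_line` helper, and the choice of `"> "` as the only container prefix are all assumptions made for illustration. Blank lines and paragraph breaks are deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """A node in the document tree (hypothetical structure, for illustration)."""
    kind: str                                   # "document", "blockquote" or "paragraph"
    children: list = field(default_factory=list)
    lines: list = field(default_factory=list)   # only used by paragraphs

    def continues(self, text):
        """Return the rest of the line if it continues this block, else None."""
        if self.kind == "blockquote" and text.startswith("> "):
            return text[2:]                     # consume the prefix
        return None

    def starts(self, text):
        """Return (new_block, rest) if a new block starts here, else None."""
        if self.kind in ("document", "blockquote") and text.startswith("> "):
            return Block("blockquote"), text[2:]
        return None


def add_line(root, line):
    node, rest = root, line
    # Descend through the active (last) container child while it continues.
    while node.children and node.children[-1].kind != "paragraph":
        remaining = node.children[-1].continues(rest)
        if remaining is None:
            break                               # no lazy lines: the child is closed
        node, rest = node.children[-1], remaining
    # Start as many new blocks as the rest of the line allows.
    while (started := node.starts(rest)) is not None:
        new_block, rest = started
        node.children.append(new_block)
        node = new_block
    # Whatever is left is a paragraph line.
    if node.children and node.children[-1].kind == "paragraph":
        node.children[-1].lines.append(rest)
    else:
        node.children.append(Block("paragraph", lines=[rest]))


doc = Block("document")
for line in ["> a", "> b", "c"]:
    add_line(doc, line)
```

In this sketch the third line lacks the `"> "` prefix, so it does not join the blockquote's paragraph and instead opens a new paragraph at the top level.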

TODO: Under the continuation line rules, a blank line belongs to a block only if it carries that block's continuation prefix, even when the prefix consists entirely of spaces and is therefore invisible. The seemingly correct fix is to let a block continue across a blank line, but this cannot be done for every block: blockquotes, for example, could then no longer be separated by blank lines.

The continuation could check for an empty line. Currently empty lines are handled in a special way. They could be merged with the add-new-paragraph handler, so that continuations could check for empty lines.

Inline Syntax

(This section is not yet implemented.)

CommonMark's rules for inline syntax are very complex, and its authors acknowledge this:

By far the trickiest part of inline parsing is handling emphasis, strong emphasis, links, and images.

The inline matcher for Market has a list of inline syntax elements, each paired with the action to take after that element is read. When an element is read (disambiguated using greedy match), its stored procedure is called with the rest of the line as an argument.
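Since this part of Market is not yet implemented, the following Python sketch shows only one way such a matcher could look. The names `make_matcher` and `TABLE`, the markers chosen, and the tuples returned by the actions are assumptions for illustration.

```python
def make_matcher(table):
    """Build a matcher from a dict mapping marker -> procedure(rest_of_line)."""
    # Try longer markers first, so that "**" wins over "*" (greedy match).
    markers = sorted(table, key=len, reverse=True)

    def match(text):
        for marker in markers:
            if text.startswith(marker):
                # Call the stored procedure with the rest of the line.
                return table[marker](text[len(marker):])
        return None  # no inline syntax at this position

    return match


# Hypothetical table: each action just reports which syntax was read.
TABLE = {
    "*": lambda rest: ("emphasis", rest),
    "**": lambda rest: ("strong", rest),
    "_": lambda rest: ("emphasis", rest),
}

match = make_matcher(TABLE)
```

Here `match("**bold** text")` dispatches to the `"**"` action rather than the `"*"` one, because the longer marker is tried first.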

The parser treats *, **, and *** as three separate cases. Combining this with the greedy-match rule causes the following behavior:

  1. Starting with *** and mixing ** and * is not allowed. Example:

    ***bold and italic** just italic*
    

    is not allowed.

  2. Contrary to the original Markdown syntax, * and _ can be combined. This allows for an equivalent of the pattern above:

    _**bold and italic** just italic_
    

    or

    *emphasized text _with emphasis in it_ and out of it*
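The three examples above can be checked with a small Python sketch. It is one possible reading of the rules, not Market's implementation: the `tokenize` and `is_well_nested` helpers are hypothetical, and the sketch assumes that a delimiter closes a span only when it is identical to the innermost open delimiter.

```python
MARKERS = ["***", "**", "*", "_"]  # longest first, so matching is greedy


def tokenize(text):
    """Split text into ("delim", marker) and ("text", string) tokens."""
    tokens, buf, i = [], [], 0
    while i < len(text):
        marker = next((m for m in MARKERS if text.startswith(m, i)), None)
        if marker is None:
            buf.append(text[i])
            i += 1
            continue
        if buf:
            tokens.append(("text", "".join(buf)))
            buf = []
        tokens.append(("delim", marker))
        i += len(marker)
    if buf:
        tokens.append(("text", "".join(buf)))
    return tokens


def is_well_nested(text):
    """True if every delimiter is closed by an identical one, innermost first."""
    stack = []
    for kind, value in tokenize(text):
        if kind != "delim":
            continue
        if stack and stack[-1] == value:
            stack.pop()          # closes the innermost open span
        else:
            stack.append(value)  # opens a new span
    return not stack
```

Under these assumptions, `***bold and italic** just italic*` is rejected because `***`, `**`, and `*` are three different delimiters and none of them is ever closed, while both `_`-based examples nest correctly.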