market - Pseudo-Markdown parser in Scheme

	Commit message	Author	Age	Files	Lines
*	Rewrite tests to conform to greedy behavior for paragraph breaks (HEAD, master)	Peter McGoron	2025-02-04	2	-7/+15
	Paragraph breaks will be associated with the deepest child that will accept them. These paragraph breaks can be moved around in a post-processing pass over the tree (which must happen anyway to parse inline syntax). This also allows for child-type-dependent processing of paragraph breaks.
*	Change handling of paragraph breaks, compress tests	Peter McGoron	2025-02-02	3	-298/+112
	Although lazy lines are explicitly not supported in Market, a restricted version of lazy lines, lazy paragraph breaks, should be. In list elements, the following:

    1. test ~~ test ~~ test paragraph

(where `~` is a visible space) should parse as

    ((list-item "test\ntest\n\ntest\n") (paragraph "paragraph\n"))

This is currently the case, but the current algorithm only looks at the current line to determine what line should be associated with the text. This is an issue because figuring out which element a lazy paragraph break should associate with requires arbitrary lookahead:

    1. test test

should be one element: on the other hand,

    1. test test

should be one element, multiple paragraph breaks, and another element.

Possible solutions:

1. Remove lazy paragraph breaks. I do not like this because it makes multi-paragraph lists depend on non-visible characters.
2. Abandon line-by-line parsing. Not good, would require the entire buffer.
3. Save state with continuations. Would be very inefficient.
4. Save state in the record. Would make the algorithm more complex.
5. When a line is found that does not continue the element, remove the paragraph breaks from the element and put them into the parent node. This is definitely possible, as the parent node is in scope of `continues` in the algorithm. This would require storing data in a way that makes the last elements easily accessible. Possible solutions:
   1. Reversed list, turn into a queue on success. Would require a recursive descent to close child nodes, although it would never revisit a closed node.
   2. Reversed list, as-is. Sounds error-prone.
   3. Doubly linked lists. Difficult to turn into regular lists, which is what the rest of Scheme uses.

I will probably do 5.1.

Note: I believe a much simpler parser could be made with delimited, composable continuations. If the location of a paragraph break cannot be determined, then a delimited continuation can be captured, more lines can be read until the disambiguating line is found, and then the continuation can process the paragraph breaks. The continuation allows for accumulation of state without explicitly putting it in a structure.
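
Option 5 with a reversed child list can be sketched as follows. This is only an illustration under stated assumptions, not the parser's actual code: the symbol `break` standing for a paragraph break and the procedures `split-trailing-breaks` and `close-child` are hypothetical names, and a node is modelled as a plain list rather than a record.

```scheme
;; Hypothetical representation: children are kept in reverse order,
;; so the most recently added child is the car of the list.
;; A paragraph break is represented by the symbol `break`.
(define (paragraph-break? x) (eq? x 'break))

;; Split a reversed child list into (values breaks rest), where
;; `breaks` are the paragraph breaks at the end of the element in
;; document order (the front of the reversed list).
(define (split-trailing-breaks reversed-children)
  (let loop ((breaks '()) (rest reversed-children))
    (if (and (pair? rest) (paragraph-break? (car rest)))
        (loop (cons (car rest) breaks) (cdr rest))
        (values breaks rest))))

;; Close a child that the current line does not continue: strip its
;; trailing breaks and push them onto the parent, after the child.
;; Returns the parent's new reversed child list.
(define (close-child type reversed-children parent-reversed)
  (call-with-values
    (lambda () (split-trailing-breaks reversed-children))
    (lambda (breaks rest)
      (append breaks
              (cons (cons type (reverse rest)) parent-reversed)))))
```

For example, `(close-child 'list-item '(break break "test\n" break "test\n") '())` leaves the interior break inside the list item and moves the two trailing breaks into the parent's reversed list. Because the list is reversed, finding the trailing breaks only inspects the front of the list.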
*	thematic breaks	Peter McGoron	2025-02-01	3	-10/+158
*	testing unordered list items	Peter McGoron	2025-01-30	1	-27/+165
*	use pipeline operators to write more comprehensive tests	Peter McGoron	2025-01-29	1	-303/+371
*	basic tests for unordered list items	Peter McGoron	2025-01-27	2	-1/+31
*	update documentation	Peter McGoron	2025-01-27	1	-12/+12
*	fix spurious newlines	Peter McGoron	2025-01-27	3	-4/+20
*	expand starts and continues by adding current node as argument	Peter McGoron	2025-01-27	5	-80/+91
	This simplifies the parser at the expense of moving the `add-new-node!` declaration into the `starts` procedure. This allows for the procedure to mutate the node when needed, which is needed for properly parsing pandoc-style grid tables.
*	indented code block test	Peter McGoron	2025-01-26	3	-14/+81
*	partial support for ATX headings	Peter McGoron	2025-01-26	4	-8/+102
*	fix tight nesting of block quotes	Peter McGoron	2025-01-26	5	-10/+61
*	parsing tests	Peter McGoron	2025-01-26	3	-1/+150
*	remove unordered list containers	Peter McGoron	2025-01-26	7	-70/+52
	Although it is possible to incorporate automatic detection of list containers in the block parser (look ahead for `* `, if not check for ` `), I think that this is premature. The point of the block parser is to take the input and figure out what block the item is in. All a list container does is compress together adjacent list items. This can be done in a second pass. (This might have the effect of causing list items separated by line breaks to be in the same list. If there is nothing in between, it would make sense.)
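
The second pass described above could look roughly like this. It is a minimal sketch, not code from the repository: the block shapes `(list-item ...)` and `(paragraph ...)`, the container tag `unordered-list`, and the procedure `group-list-items` are all assumed for illustration.

```scheme
;; Hypothetical block representation: each block is a list whose car
;; is its type, e.g. (list-item "a\n") or (paragraph "p\n").
(define (list-item? block)
  (and (pair? block) (eq? (car block) 'list-item)))

;; Wrap each maximal run of adjacent list-item blocks in an
;; (unordered-list ...) container; other blocks pass through unchanged.
(define (group-list-items blocks)
  (let loop ((blocks blocks) (run '()) (out '()))
    ;; Emit the pending run (if any) as one container onto `out`.
    (define (flush)
      (if (null? run)
          out
          (cons (cons 'unordered-list (reverse run)) out)))
    (cond ((null? blocks) (reverse (flush)))
          ((list-item? (car blocks))
           (loop (cdr blocks) (cons (car blocks) run) out))
          (else
           (loop (cdr blocks) '() (cons (car blocks) (flush)))))))
```

With this sketch, two adjacent list items followed by a paragraph and a third list item become two containers with the paragraph between them. Whether items separated only by paragraph breaks end up in the same list depends on how breaks are represented in the tree, which this sketch does not model.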
*	block structure	Peter McGoron	2025-01-25	10	-0/+891