| |
| |
Although lazy lines are explicitly not supported in Market, a restricted
version of lazy lines, lazy paragraph breaks, should be. In list elements,
the following:
1. test
~~ test

~~ test
paragraph
(where `~` is a visible space) should parse as
((list-item "test\ntest\n\ntest\n") (paragraph "paragraph\n"))
This is currently the case, but the current algorithm only looks at the
current line to determine which element that line belongs to.
This is a problem because deciding which element a lazy paragraph
break belongs to requires arbitrary lookahead:
1. test


~~ test
should be one element. On the other hand,
1. test


test
should be one element, multiple paragraph breaks, and another element.
Possible solutions:
1. Remove lazy paragraph breaks. I do not like this because it makes
   multi-paragraph lists depend on non-visible characters.
2. Abandon line-by-line parsing. Not good: it would require the entire buffer.
3. Save state with continuations. Would be very inefficient.
4. Save state in the record. Would make the algorithm more complex.
5. When a line is found that does not continue the element, remove the
   trailing paragraph breaks from the element and put them into the parent
   node. This is definitely possible, as the parent node is in scope of
   `continues` in the algorithm. It would require storing data in a way
   that makes the last elements easily accessible. Possible solutions:
   1. Reversed list, turned into a queue on success. Would require a
      recursive descent to close child nodes, although it would never
      revisit a closed node.
   2. Reversed list, as-is. Sounds error-prone.
   3. Doubly linked lists. Difficult to turn into regular lists, which is
      what the rest of Scheme uses.
I will probably do 5.1.
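
A minimal sketch of what 5.1 could look like, written in Python rather than
Scheme so it is easy to run. It is not the actual Market code. Everything
named here (`BREAK`, `cons`, `in_order`, `pop_breaks`, `close_item`,
`parse_with_fixup`) is hypothetical, and it assumes only `1. ` markers, a
fixed indentation of three spaces, and single-line paragraphs. Children are
kept newest-first in a cons-style list, so the trailing breaks are the first
thing reachable when an element is closed.

```python
BREAK = ("break",)  # hypothetical marker for one paragraph break


def cons(head, tail):
    """Prepend to a newest-first list, modelled as nested pairs like a Scheme list."""
    return (head, tail)


def in_order(rev):
    """Turn a newest-first pair list into an ordinary list in document order."""
    out = []
    while rev is not None:
        out.append(rev[0])
        rev = rev[1]
    out.reverse()
    return out


def pop_breaks(rev):
    """Strip the breaks at the front of a newest-first list (the trailing ones)."""
    count = 0
    while rev is not None and rev[0] == BREAK:
        count += 1
        rev = rev[1]
    return count, rev


def close_item(item, parent):
    """Close the open item and hand its trailing breaks to the parent."""
    count, item = pop_breaks(item)
    parent = cons(("list-item", in_order(item)), parent)
    for _ in range(count):
        parent = cons(BREAK, parent)
    return parent


def parse_with_fixup(lines, indent=3):
    parent = None  # newest-first children of the document node
    item = None    # newest-first content of the open list item, if any
    for line in lines:
        if item is not None:
            if line.strip() == "":
                item = cons(BREAK, item)  # tentatively owned by the item
                continue
            if line.startswith(" " * indent):
                item = cons(line[indent:], item)
                continue
            parent = close_item(item, parent)  # line does not continue the item
            item = None
        if line.strip() == "":
            parent = cons(BREAK, parent)
        elif line.startswith("1. "):
            item = cons(line[3:], None)
        else:
            parent = cons(("paragraph", [line]), parent)
    if item is not None:
        parent = close_item(item, parent)
    return in_order(parent)
```

The parser still sees one line at a time. Blank lines are stored in the open
item first and only move to the parent once a non-continuing line shows up.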
Note: I believe a much simpler parser could be made with delimited, composable
continuations. If the owner of a paragraph break cannot be determined yet, a
delimited continuation can be captured and more lines read until the
disambiguating line is found; the continuation can then process the
paragraph breaks. The continuation allows for accumulation of state without
explicitly putting it in a structure.
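
To illustrate the idea only: Python has no first-class delimited
continuations, so this sketch uses a generator suspended at `yield` as a rough
stand-in. It is an approximation of the technique, not the same thing. The
names (`BREAK`, `item_parser`, `parse_with_coroutine`) are hypothetical, and
the same simplifications apply: only `1. ` markers, three-space indentation,
single-line paragraphs. The undecided breaks live in the local variable
`pending` and never appear in a record.

```python
BREAK = ("break",)  # hypothetical marker for one paragraph break


def item_parser(first, indent=3):
    """Coroutine for one list item.

    Lines arrive through send(). When a line that does not continue the
    item arrives (or None at end of input), the generator returns
    (content, undecided_breaks, leftover_line).
    """
    content = [first]
    pending = 0  # blank lines whose owner is not known yet
    while True:
        line = yield  # suspend until the driver supplies another line
        if line is not None and line.strip() == "":
            pending += 1
        elif line is not None and line.startswith(" " * indent):
            content.extend([BREAK] * pending)  # the breaks belonged to the item
            pending = 0
            content.append(line[indent:])
        else:
            return content, pending, line  # the breaks belong to the parent


def parse_with_coroutine(lines):
    out = []
    active = None  # suspended item parser, if a list item is open
    for line in [*lines, None]:  # None marks end of input
        if active is not None:
            try:
                active.send(line)
                continue  # the open item consumed the line
            except StopIteration as stop:
                content, breaks, line = stop.value
                out.append(("list-item", content))
                out.extend([BREAK] * breaks)
                active = None
        if line is None:
            break
        if line.strip() == "":
            out.append(BREAK)
        elif line.startswith("1. "):
            active = item_parser(line[3:])
            next(active)  # run to the first yield
        else:
            out.append(("paragraph", [line]))
    return out
```

The driver still feeds one line at a time; the lookahead is hidden in the
suspended generator.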
|
| |
| |
It is possible to incorporate automatic detection of list
containers in the block parser (look ahead for `* `, if not check
for ` `), but I think that this is premature.
The point of the block parser is to take the input and figure out
what block the item is in. All a list container does is compress
together adjacent list items. This can be done in a second pass.
(This might cause list items separated by line breaks to end up in
the same list. If there is nothing in between, that would make
sense.)
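
A sketch of such a second pass, in Python for illustration and under my own
assumptions: blocks are tuples like `("list-item", [...])`, a break is the
hypothetical marker `BREAK`, and `group_list_items` is a made-up name. Breaks
that sit between two list items are kept inside the list; breaks followed by
anything else stay outside it.

```python
BREAK = ("break",)  # hypothetical marker for one paragraph break


def group_list_items(blocks):
    """Second pass: wrap runs of adjacent list items in a ("list", [...]) node."""
    out = []
    pending = []  # breaks seen right after an open list, owner undecided
    for block in blocks:
        in_list = bool(out) and out[-1][0] == "list"
        if block == BREAK and in_list:
            pending.append(block)
        elif block[0] == "list-item":
            if in_list:
                out[-1][1].extend(pending)  # breaks sat between two items
                out[-1][1].append(block)
            else:
                out.append(("list", [block]))
            pending = []
        else:
            out.extend(pending)  # breaks separate the list from this block
            pending = []
            out.append(block)
    out.extend(pending)
    return out
```

The block parser does not need to know about list containers for this to work.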
|