aboutsummaryrefslogtreecommitdiffstats
path: root/README.md
blob: b098a4139ae199054a7a7795e3821bb77b24b3dd (plain) (blame)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
# HaScheme -- Call By Name Scheme

> Scheme demonstrates that a very small number of rules for forming
> expressions, with no restrictions on how they are composed, suffice to
> form a practical and efficient programming language that is flexible
> enough to support most of the major programming paradigms in use today.
>
> -- Revisedⁿ Reports on the Algorithmic Language Scheme (n ≥ 3, 1986–)

HaScheme is a library that implements a subset of the R⁷RS in a lazy
way. When HaScheme is used by itself, it is a lazy functional programming
language with the same syntax as Scheme, embedded within Scheme.

For example, the following `map` implementation is both valid HaScheme
(importing `(hascheme base)`) and Scheme (importing `(scheme base)`):

    (define (map f list)
      (if (null? list)
          '()
          (cons (f (car list)) (map f (cdr list)))))

Since HaScheme is lazy, this function will return a promise. The
function's body is only executed if, when attempting to `force`
a promise containing the map, the contents of the map are needed for
a computation.

Procedures in HaScheme can be written in ways that look
identical to regular Scheme. They do not need to be explicitly `delay`ed
or `delay-force`ed, and they only need to `force` to reduce space usage.
HaScheme procedures return promises which can be forced by regular
code. HaScheme's datatypes are the same as regular Schemes. Hence lazy
and non-lazy code can co-exist. It is easy to wrap eager Scheme procedures
to be usable in HaScheme.

*Every* procedure in HaScheme is lazy. Values are forced in conditionals,
or explicitly using `seq`. This allows for the call-by-value semantics
of Scheme to be turned into call-by-need semantics without any syntactic
cruft.

HaScheme should run in any implemention of the R⁷RS that supports the
[SRFI 259][SRFI-259]. It does not use `(scheme lazy)`.

Why use this?

1. To have fun playing around with functional infinite data structures.
2. To embed lazy and pure algorithms into impure Scheme with ease.
3. To show those dirty Haskellers that you don't need no stinkin'
   static type system.

See also [Lazy Racket][LazyRacket].

[LazyRacket]: https://docs.racket-lang.org/lazy/index.html

HaScheme is licensed under the 0BSD license.

HaScheme is tested to work in CHICKEN 6.0.0 latest master.

## Restrictions and Implementation Notes

1. No `call/cc`. [Explanation](#multiple-values-and-continuations)
2. No `call-with-values` or multiple-valued returns.
   [Explanation](#multiple-values-and-continuations)
3. No exceptions (but `error` exists).
4. Strings and bytevectors are eager objects. For example, forcing a
   string will also force any characters and strings used to build
   the string.
5. Pairs and vectors are lazy objects. Forcing a pair will not force its
   components.
6. No mutation and no I/O (i.e. no ports).
7. Parameters are not supported because forcing a promise uses the
   parameters of the dynamic extent of the force, and not the dynamic
   extent of the delay. This makes them useless in this context.
   This would be fixed by SRFI 226.
8. No quasiquote.

## Fun (or Pain) with Laziness

You need to be careful with lazy functions because they can cause
space leaks. This is a problem in general with lazy languages ([like
in Haskell][HaskellFolds]). Here is an example:

[HaskellFolds]: https://wiki.haskell.org/Foldr_Foldl_Foldl%27

    (define (list-tail list n)
      (if (zero? n)
          list
          (list-tail (cdr list) (- n 1))))

Thunks will build up over time in the list, so it must be forced.

    (define (list-tail list n)
      (if (zero? n)
          list
          (list-tail (force (cdr list)) (- n 1))))

Note that `n` is never explicitly forced: it is implicitly forced by the
control flow.

The first code block has the attractive property that it operates the
same way on finite lists in both Scheme and HaScheme, while the second
one could differ in exotic cases (like promises that return promises).
Instead of writing `force`, the operator `!` is used:

    (define (list-tail list n)
      (if (zero? n)
          list
          (list-tail (! (cdr list)) (- n 1))))

where `(! x)` is defined to just be `x` in Scheme. Now the code block
above operates the same in Scheme and HaScheme.

Ok, now we have fixed our space leak issues. Right? Let's try another
infinite list trick: a list of all natural numbers.

    (define naturals (list-tabulate +inf.0 (lambda (x) x)))
    (! (list-tail naturals 1000000000))

This also leaks! This is because the promises are making new cons cells,
and storing them in `naturals`. We need to organize things to make sure
the program can clean up.

    (! (list-tail (list-tabulate +inf.0 (lambda (x) x)) 1000000000))

This will run in bounded space.

## Call-by-Need and Conditionals

Since call-by-need will only execute a function when needed, conditional
forms like `if` can be implemented as functions and not syntax. In fact,
HaScheme implements `if`, `and`, `or`, and the `cond`-like `cond*` as
functions, meaning one can pass them around as values.

For instance:

    (define (map f l)
      (cond
        ((null? l) '())
        ((pair? l) (cons (f (car l)) (cdr l)))
        (else (error "not a list" l))))

implemented with `cond*` is

    (define (map f l)
      (cond*
       (null? l) '()
       (pair? l) (cons f (car l) (cdr l))
       #t (error "not a list" l)))

Neat, right? Well, if we go to `list-tail` we have a problem:

    (define (list-tail list n)
      (if (zero? n)
          list
          (list-tail (! (cdr list)) (- n 1))))

Since `if` is now a function, Scheme (our call-by-value host language)
will attempt to reduce `(! (cdr list))` every time, even when we don't
need to. We could go back to syntactic if, or we could add some wrapper
to the procedure. The `seq` function (named after the function in Haskell)
takes `n` forms, forces the first `n-1`, and returns the `n`th form.

    (define (list-tail list n)
      (if* (zero? n)
           list
           (seq (cdr list)
                (list-tail (cdr list) (- n 1)))))

## Multiple Values and Continuations

HaScheme doesn't have `call/cc`. `call/cc` is not a function because it
does not return, so that's strike one for inclusion in a pure language.
Reified continuations make sense in a call-by-value language, because
there is a definite evaluation order (innermost first), but a lazy
language can execute any code at basically any time.

A future implementation might be able to use SRFI-226's delimited
control structures to implement continuations, because they are actual
functions.

Multiple values are specified as returning values to their continuation.
Since HaScheme does not (conceptually) have continuations, multiple
values have to be interpreted differently. But a bigger issue occurs
because a promise is a single value. It cannot be decomposed into more
values without forcing the promise. Multiple value returns are simulated
using lists, although vectors could also work.

## Why `delay` and `delay-force` Are Not Enough

Scheme for a long time had `delay` and `force` that were never implemented
very well. It was only in the R⁷RS that they were implemented in a
safe-for-space way. However, the usual transformation does not handle
higher-order procedures correctly.

For example, consider the transformation advocated in [SRFI 45][SRFI-45].
The `map` function, defined as

    (define (map f lst)
      (if (null? lst)
          '()
          (cons (f (car lst)) (map f (cdr lst)))))

becomes

    (define (map f lst)
      (delay-force
        (if (null? (force lst))
            (delay '())
            (delay (cons (f (car (force lst)))
                         (map f (cdr (force lst))))))))

So far, so good. But let us define

    (define (add-n n)
      (lambda (x)
        (+ x n)))

which then becomes

    (define (add-n n)
      (delay-force
        (lambda (x)
          (delay-force (+ (force x) (force n))))))

If we evaluated, in normal Scheme,

    (map (add-n 5) '(1 2 3 4))

we would get `(6 7 8 9)` back. But if we evaluated this in our lazy
language, we would get an error because we tried to apply arguments to
a non-procedure.

We could get around this by annotating each higher-order procedure with
`force`. But this violates one of the principles of HaScheme, which is that
the code should look natural.

Instead, this library uses [SRFI 259][SRFI-259] tagged procedures to wrap
promises. This allows for arbitrary higher-order procedures expressed
in a natural way. The R7RS `delay`/`delay-force`/`force` expressions
are re-implemented using the tagged procedures.

Each `lambda` creates a procedure object that can be called with an
arbitrary number of arguments to create another procedure. Procedure
argument length is only checked when the procedure is forced.

[SRFI-45]: https://srfi.schemers.org/srfi-45
[SRFI-259]: https://srfi.schemers.org/srfi-259