HaScheme -- Call By Name Scheme

Scheme demonstrates that a very small number of rules for forming expressions, with no restrictions on how they are composed, suffice to form a practical and efficient programming language that is flexible enough to support most of the major programming paradigms in use today.

-- Revisedⁿ Reports on the Algorithmic Language Scheme (n ≥ 3, 1986–)

HaScheme is a library that implements a subset of the R⁷RS in a lazy way. When HaScheme is used by itself, it is a lazy functional programming language with the same syntax as Scheme, embedded within Scheme.

For example, the following map implementation is both valid HaScheme (importing (hascheme base)) and Scheme (importing (scheme base)):

(define (map f list)
  (if (null? list)
      '()
      (cons (f (car list)) (map f (cdr list)))))

Since HaScheme is lazy, calling this function returns a promise. The function's body runs only when that promise is forced, that is, when some computation actually needs the contents of the mapped list.
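
To make "returns a promise" concrete, here is a minimal sketch in plain R⁷RS using (scheme lazy), not HaScheme itself. The names lazy-map and calls are hypothetical, and the explicit delay and force stand in for what HaScheme does implicitly.

```scheme
(import (scheme base) (scheme lazy))

;; Counts how many times f has actually been applied.
(define calls 0)

;; Hypothetical stand-in for HaScheme's map: the body is wrapped in
;; an explicit delay, so calling lazy-map only builds a promise.
(define (lazy-map f lst)
  (delay
    (if (null? lst)
        '()
        (cons (begin (set! calls (+ calls 1))
                     (f (car lst)))
              (lazy-map f (cdr lst))))))

(define squares (lazy-map (lambda (x) (* x x)) '(1 2 3)))
;; Nothing has run yet: calls is still 0.

(define first-cell (force squares))
;; Forcing ran the body once: calls is 1, (car first-cell) is 1,
;; and (cdr first-cell) is another unforced promise.
```

In HaScheme the delay is implicit, which is why the map shown earlier needs no annotations.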

Procedures in HaScheme can be written in ways that look identical to regular Scheme. They do not need to be explicitly wrapped in delay or delay-force, and they only need force to reduce space usage. HaScheme procedures return promises, which can be forced by regular code. HaScheme's datatypes are the same as regular Scheme's, so lazy and non-lazy code can co-exist. It is easy to wrap eager Scheme procedures so that they are usable in HaScheme.
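
One way to picture wrapping an eager procedure, sketched in plain R⁷RS rather than with HaScheme's actual wrapper (which is not shown here): force every argument, apply the eager procedure, and delay the whole call. The name lift-eager is hypothetical.

```scheme
(import (scheme base) (scheme lazy))

;; Hypothetical helper: turn an eager procedure into one that accepts
;; promises (or plain values) and returns a promise.
(define (lift-eager proc)
  (lambda args
    (delay
      ;; make-promise leaves promises alone and wraps plain values,
      ;; so forcing its result is safe for both kinds of argument.
      (apply proc
             (map (lambda (a) (force (make-promise a))) args)))))

(define lazy+ (lift-eager +))

(define sum (lazy+ (delay (* 6 7)) 8))   ; no arithmetic has happened yet
(force sum)                              ; => 50
```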

Every procedure in HaScheme is lazy. Values are forced in conditionals, or explicitly using seq. This allows for the call-by-value semantics of Scheme to be turned into call-by-need semantics without any syntactic cruft.

HaScheme should run in any implementation of the R⁷RS that supports SRFI 259. It does not use (scheme lazy).

Why use this?

To have fun playing around with functional infinite data structures.
To embed lazy and pure algorithms into impure Scheme with ease.
To show those dirty Haskellers that you don't need no stinkin' static type system.

Restrictions and Implementation Notes

No call/cc (see "Multiple Values and Continuations" below).
No call-with-values or multiple-valued returns (see "Multiple Values and Continuations" below).
No exceptions (but error exists).
Strings and bytevectors are eager objects. For example, forcing a string will also force any characters and strings used to build the string.
Pairs and vectors are lazy objects. Forcing a pair will not force its components.
No mutation and no I/O (i.e. no ports).
Parameters are not supported because forcing a promise uses the parameters of the dynamic extent of the force, and not the dynamic extent of the delay. This makes them useless in this context. This would be fixed by SRFI 226.
No quasiquote.
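
The note on parameters can be illustrated with ordinary R⁷RS promises. This is a sketch using (scheme lazy), not HaScheme's own promises, and the parameter name mode is hypothetical.

```scheme
(import (scheme base) (scheme lazy))

(define mode (make-parameter 'outer))

;; The promise is created while mode is bound to inner ...
(define p
  (parameterize ((mode 'inner))
    (delay (mode))))

;; ... but it is forced here, outside the parameterize form. On
;; implementations that behave as described above, the body sees the
;; binding in effect at the force, not the one in effect at the delay.
(define seen (force p))
```

Under that behavior seen is outer rather than inner, so a lazy procedure could not rely on the parameterization its caller set up.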

Fun (or Pain) with Laziness

You need to be careful with lazy functions because they can cause space leaks. This is a general problem with lazy languages, Haskell included. Here is an example:

(define (list-tail list n)
  (if (zero? n)
      list
      (list-tail (cdr list) (- n 1))))

Each recursive call passes along an unforced (cdr list), so thunks build up in the list argument over time. The list must be forced at each step:

(define (list-tail list n)
  (if (zero? n)
      list
      (list-tail (force (cdr list)) (- n 1))))

Note that n is never explicitly forced: it is implicitly forced by the control flow.

The first code block has the attractive property that it operates the same way on finite lists in both Scheme and HaScheme, while the second one could differ in exotic cases (like promises that return promises). Instead of writing force, the operator ! is used:

(define (list-tail list n)
  (if (zero? n)
      list
      (list-tail (! (cdr list)) (- n 1))))

where (! x) is defined to just be x in Scheme. Now the code block above operates the same in Scheme and HaScheme.
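
On the eager side this can be sketched in a few lines. The definition of ! follows the description above; the name my-list-tail is hypothetical and is used only to avoid clashing with the list-tail exported by (scheme base).

```scheme
(import (scheme base))

;; In eager Scheme every value is already computed, so ! is the identity.
(define (! x) x)

;; Same body as the list-tail above, under a different name.
(define (my-list-tail list n)
  (if (zero? n)
      list
      (my-list-tail (! (cdr list)) (- n 1))))

(my-list-tail '(a b c d e) 2)   ; => (c d e)
```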

Ok, now we have fixed our space leak issues. Right? Let's try another infinite list trick: a list of all natural numbers.

(define naturals (list-tabulate +inf.0 (lambda (x) x)))
(! (list-tail naturals 1000000000))

This also leaks! The variable naturals holds on to the head of the list, so every cons cell the promises create while list-tail walks the list stays reachable and can never be collected. We need to organize things to make sure the program can clean up.

(! (list-tail (list-tabulate +inf.0 (lambda (x) x)) 1000000000))

This will run in bounded space.
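
The difference between the two shapes can be reproduced with explicit promises. This is a sketch in plain R⁷RS, with a much smaller count so that it finishes quickly; naturals-from and lazy-list-tail are hypothetical helpers, not HaScheme procedures.

```scheme
(import (scheme base) (scheme lazy))

;; Hypothetical helper: the infinite list n, n+1, n+2, ... where each
;; tail is an explicit promise.
(define (naturals-from n)
  (cons n (delay (naturals-from (+ n 1)))))

;; Walk k cells down the list, forcing one tail per step.
(define (lazy-list-tail s k)
  (if (zero? k)
      s
      (lazy-list-tail (force (cdr s)) (- k 1))))

;; Leaky shape: the global keeps the first cell alive, and each forced
;; promise remembers its value, so all 100001 cells stay reachable.
(define naturals (naturals-from 0))
(define leaky (car (lazy-list-tail naturals 100000)))

;; Bounded shape: nothing refers to the head, so cells already passed
;; can be reclaimed while the walk continues.
(define bounded (car (lazy-list-tail (naturals-from 0) 100000)))
```

Both walks compute the same value; only what the garbage collector is allowed to free differs.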

Call-by-Need and Conditionals

Since call-by-need will only execute a function when needed, conditional forms like if can be implemented as functions and not syntax. In fact, HaScheme implements if, and, or, and the cond-like cond* as functions, meaning one can pass them around as values.
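
As a rough illustration of why this works, here is a functional conditional written against explicit R⁷RS promises. The names if-fn and safe-div are hypothetical; in HaScheme the arguments are already lazy, so no delay would be written.

```scheme
(import (scheme base) (scheme lazy))

;; Hypothetical sketch of a functional conditional: every argument is
;; a promise, and only the chosen branch is ever forced.
(define (if-fn test consequent alternative)
  (if (force test)
      (force consequent)
      (force alternative)))

(define (safe-div a b)
  (if-fn (delay (zero? b))
         (delay 'undefined)
         (delay (/ a b))))      ; never forced when b is 0

(safe-div 10 0)   ; => undefined
(safe-div 10 2)   ; => 5

;; Being an ordinary procedure, if-fn can be passed around:
(apply if-fn (list (delay #t) (delay 'yes) (delay 'no)))   ; => yes
```

The same idea extends to multi-branch conditionals.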

For instance:

(define (map f l)
  (cond
    ((null? l) '())
    ((pair? l) (cons (f (car l)) (map f (cdr l))))
    (else (error "not a list" l))))

implemented with cond* is

(define (map f l)
  (cond*
   (null? l) '()
   (pair? l) (cons (f (car l)) (map f (cdr l)))
   #t (error "not a list" l)))

Neat, right? Well, if we go to list-tail we have a problem:

(define (list-tail list n)
  (if (zero? n)
      list
      (list-tail (! (cdr list)) (- n 1))))

Since if is now a function, Scheme (our call-by-value host language) will attempt to reduce (! (cdr list)) every time, even when we don't need to. We could go back to syntactic if, or we could add some wrapper to the procedure. The seq function (named after the function in Haskell) takes n forms, forces the first n-1, and returns the nth form.

(define (list-tail list n)
  (if* (zero? n)
       list
       (seq (cdr list)
            (list-tail (cdr list) (- n 1)))))
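
The behavior of seq described above can be sketched over explicit R⁷RS promises. The names seq-sketch and trace are hypothetical, and the set! calls exist only to make the forcing order observable; they are not something HaScheme code would contain.

```scheme
(import (scheme base) (scheme lazy))

;; Records which promises have been forced, most recent first.
(define trace '())

;; Hypothetical sketch of seq over explicit promises: force every
;; argument except the last, then return the last one untouched.
(define (seq-sketch first . rest)
  (if (null? rest)
      first
      (begin (force first)
             (apply seq-sketch rest))))

(define result
  (seq-sketch (delay (set! trace (cons 'a trace)))
              (delay (set! trace (cons 'b trace)))
              (delay (begin (set! trace (cons 'c trace)) 'done))))

;; trace is now (b a): the first two promises ran, the last did not.
;; (force result) would run it and return done.
```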

Multiple Values and Continuations

HaScheme doesn't have call/cc. call/cc is not a function in the mathematical sense, because calling a captured continuation never returns to its caller, so that's strike one for inclusion in a pure language. Reified continuations make sense in a call-by-value language, because there is a definite evaluation order (innermost first), but a lazy language can execute any code at basically any time.

A future implementation might be able to use SRFI 226's delimited control structures to implement continuations, because they are actual functions.

Multiple values are specified as returning values to their continuation. Since HaScheme does not (conceptually) have continuations, multiple values have to be interpreted differently. But a bigger issue occurs because a promise is a single value. It cannot be decomposed into more values without forcing the promise. Multiple value returns are simulated using lists, although vectors could also work.
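
A small sketch of the list-based approach, written with explicit R⁷RS promises. The procedure lazy-div-mod is a hypothetical example, not part of HaScheme.

```scheme
(import (scheme base) (scheme lazy))

;; Hypothetical example: quotient and remainder bundled in one list,
;; so the whole result can sit behind a single promise.
(define (lazy-div-mod a b)
  (delay (list (floor-quotient a b)
               (floor-remainder a b))))

(define qr (force (lazy-div-mod 17 5)))
(car qr)    ; => 3
(cadr qr)   ; => 2
```

The caller forces one promise and then takes the list apart, where eager Scheme would use call-with-values.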

Why `delay` and `delay-force` Are Not Enough

Scheme has had delay and force for a long time, but they were never specified in a way that guaranteed good space behavior. Only the R⁷RS, which adopted delay-force, made them safe-for-space. However, the usual transformation into lazy code does not handle higher-order procedures correctly.

For example, consider the transformation advocated in SRFI 45. The map function, defined as

(define (map f lst)
  (if (null? lst)
      '()
      (cons (f (car lst)) (map f (cdr lst)))))

becomes

(define (map f lst)
  (delay-force
    (if (null? (force lst))
        (delay '())
        (delay (cons (f (car (force lst)))
                     (map f (cdr (force lst))))))))

So far, so good. But let us define

(define (add-n n)
  (lambda (x)
    (+ x n)))

which then becomes

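(define (add-n n)
  ;; delay rather than delay-force: neither the lambda nor the sum is
  ;; itself a promise, and delay-force expects a promise.
  (delay
    (lambda (x)
      (delay (+ (force x) (force n))))))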

If we evaluated, in normal Scheme,

(map (add-n 5) '(1 2 3 4))

we would get (6 7 8 9) back. But if we evaluated this in our lazy language, we would get an error: (add-n 5) now returns a promise rather than a procedure, and map tries to apply that promise to an argument.

We could get around this by annotating each higher-order procedure with force. But this violates one of the principles of HaScheme, which is that the code should look natural.
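
For illustration, this is roughly what the force-annotated version looks like in plain R⁷RS. It is a sketch: lazy-map, list->lazy and lazy->list are hypothetical names, and the only change to the transformed map is the (force f) in operator position.

```scheme
(import (scheme base) (scheme lazy))

;; add-n returns a promise for a procedure, as in the transformation above.
(define (add-n n)
  (delay (lambda (x) (delay (+ (force x) (force n))))))

;; The operator must be forced before it can be applied.
(define (lazy-map f lst)
  (delay-force
    (if (null? (force lst))
        (delay '())
        (delay (cons ((force f) (car (force lst)))
                     (lazy-map f (cdr (force lst))))))))

;; Hypothetical helpers to move between eager lists and delayed ones.
(define (list->lazy lst)
  (delay (if (null? lst)
             '()
             (cons (make-promise (car lst))
                   (list->lazy (cdr lst))))))

(define (lazy->list p)
  (let ((cell (force p)))
    (if (null? cell)
        '()
        (cons (force (car cell))
              (lazy->list (cdr cell))))))

(lazy->list (lazy-map (add-n (make-promise 5))
                      (list->lazy '(1 2 3 4))))   ; => (6 7 8 9)
```

Every procedure that receives a procedure argument would need the same annotation, which is the cost the paragraph above objects to.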

Instead, this library uses SRFI 259 tagged procedures to wrap promises. This allows for arbitrary higher-order procedures expressed in a natural way. The R⁷RS delay/delay-force/force expressions are re-implemented using the tagged procedures.