228: A further comparator library

by Daphne Preston-Kendal

Status

This SRFI is currently in draft status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-228@nospamsrfi.schemers.org. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.

Abstract

Further procedures and syntax forms for defining SRFI 128 comparators, and for extracting comparison procedures similar to those defined for Scheme’s built-in types using them.

Best enjoyed in combination with SRFI 162.

Issues

Outstanding issues are noted inline, denoted by a highlighted background.

Rationale

A typical pattern for defining comparison behaviour for record types and other user-defined Scheme types is some form of lexicographical order over their member slots. The comparator abstraction of SRFI 128 has useful potential for reducing the code bulk of defining such comparison behaviours. This library is intended to take advantage of that potential.

Specification

All identifiers defined in this SRFI are exported in the (srfi 228) library. However, as in the case of SRFI 162, it is hoped and intended that they will simply be added to the existing comparator libraries of RnRSes and Scheme implementations, rather than being in a separate library.

(make-wrapper-comparator type-test unwrap contents-comparator) (Procedure)

Returns a comparator which compares values satisfying the predicate type-test by first calling the given unwrap procedure on them, then comparing the output of that procedure with the given contents-comparator. The hash function of the wrapper comparator returns the same value as the contents-comparator run on the unwrapped value.

(make-product-comparator comparator ... ) (Procedure)

Returns a comparator which compares values satisfying the type tests of all of the given comparators by comparing them with each of the given comparators in turn, left to right, and returning the result of the first non-equal comparison. If all the given comparators consider two values equal, the product comparator also considers them equal. The hash function of the product comparator hashes together the results of all the given comparators in an implementation-defined way. If the equality or ordering predicates or the hash function of any of the given comparators is #f, the corresponding procedure in the product comparator will also be #f.

(make-sum-comparator comparator ... ) (Procedure)

Returns a comparator which compares values satisfying the type-tests of any of the given comparators such that values which satisfy the type test of a given comparator are ordered before any values satisfying the type tests of any comparators that appear to the right of it, and values satisfying the same comparator’s type test are tested for ordering and equality according that comparator. The hash function of the sum comparator returns the value of the hash function of the leftmost comparator whose type test is satisfied by the given value. If the equality or ordering predicates or the hash function of any of the given comparators is #f, the corresponding procedure in the product comparator will also be #f.

(comparison-procedures comparator) (Procedure)

Returns five values, variadic procedures corresponding to <, <=, =, >=, and > respectively for the given comparator. Each one is equivalent to a partial application of the SRFI 128 procedures <?, <=?, =?, >=?, and >? with the given comparator.

This is admittedly a rather large number of return values. I hope that that R7RS will adopt some kind of keyword argument system, capable of being extended to keyword return values, and that it will then be possible to redefine comparison-procedures to return procedures associated with keyword names to make the order irrelevant and code using this procedure less opaque. In the meanwhile …?

As long as we’re stuck with positional return values, I should note that this isn’t the order usually used for the specification of comparison procedures (e.g. the char*? family in R7RS small uses the order char=?, char<?, char>?. char<=?, char>=?). I find the order here easier to remember, but perhaps it would be better to switch to that order for consistency.

Examples

Personal names are usually sorted lexicographically and case-insensitively by the last name, then the first name if the last names are the same. The following example shows how make-wrapper-comparator and make-product-comparator can be used to create a comparator which orders a record type of personal names in this way, and how comparison-procedures can then be used for this ordering.

(define-record-type Person
  (make-person first-name last-name)
  person?
  (first-name person-first-name)
  (last-name person-last-name))

(define person-name-comparator
  (make-product-comparator
   (make-wrapper-comparator person? person-last-name string-ci-comparator)
   (make-wrapper-comparator person? person-first-name string-ci-comparator)))

(define-values (person-name<?
                person-name<=?
                person-name=?
                person-name>?
                person-name>=?)
  (comparison-procedures person-name-comparator))

(person-name<? (make-person "John" "Cowan")
               (make-person "Daphne" "Preston-Kendal")) ;=> #t

(person-name>? (make-person "Tom" "Smith")
               (make-person "John" "Smith")) ;=> #t

This example can be extended with nested comparators to sort a catalogue in which CDs appear at the end of the list after books:

(define-record-type Book
  (make-book author title)
  book?
  (author book-author)
  (title book-title))

(define book-comparator
  (make-product-comparator
   (make-wrapper-comparator book? book-author person-name-comparator)
   (make-wrapper-comparator book? book-title string-ci-comparator)))

(define-record-type CD
  (make-cd artist title)
  cd?
  (artist cd-artist)
  (title cd-title))

(define cd-comparator
  (make-product-comparator
   (make-wrapper-comparator cd? cd-artist person-name-comparator)
   (make-wrapper-comparator cd? cd-title string-ci-comparator)))

(define item-comparator
  (make-sum-comparator book-comparator cd-comparator))

(define-values (item<?
                item<=?
                item=?
                item>?
                item>=?)
  (comparison-procedures item-comparator))

(list-sort item<?
           (list (make-cd (make-person "The" "Beatles") "Abbey Road")
                 (make-book (make-person "Jacob" "Grimm") "Deutsche Grammatik")
                 (make-book (make-person "William" "Shakespeare") "Sonnets")
                 (make-book (make-person "William" "Shakespeare") "A Midsummer Night’s Dream")
                 (make-cd (make-person "Bob" "Dylan") "Blonde on Blonde")
                 (make-cd (make-person "The" "Beatles") "Revolver")))
;; => ({Book {Person "Jacob" "Grimm"} "Deutsche Grammatik"}
;;     {Book {Person "William" "Shakespeare"} "A Midsummer Night’s Dream"}
;;     {Book {Person "William" "Shakespeare"} "Sonnets"}
;;     {CD {Person "The" "Beatles"} "Abbey Road"}
;;     {CD {Person "The" "Beatles"} "Revolver"}
;;     {CD {Person "Bob" "Dylan"} "Blonde on Blonde"})

Implementation

A portable sample implementation and test suite for R7RS may be found in the repository for this SRFI.

Acknowledgements

make-sum-comparator, and by extension the inspiration for the name of make-product-comparator, is originally from Adam Nelson’s Schemepunk.

Marc Nieper-Wißkirchen and John Cowan suggested means of improving the comparison-procedures procedure; as of the current draft, this remains an open issue.

Future work

The author hopes that a future SRFI will add a procedure for creating comparators yielding lexicographical order over any sequence type by delegating to a common iteration protocol. An idea to do this using fold procedures foundered on two grounds: the first and more intrinsic one is that fold called on two sequences, one of which is a prefix of the other, cannot determine which of the two is longer, and a sort using fold-based iteration would incorrectly consider them equal; the second is that there is currently an inconsistency among Scheme libraries in what order the kons procedure argument to fold receives the accumulator and the next values in the sequence (compare SRFI 1 fold with SRFI 133 vector-fold). SRFI 158 generators were rejected on the ground that their sequences cannot contain any arbitrary Scheme datum.

© 2021 Daphne Preston-Kendal.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice (including the next paragraph) shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Editor: Arthur A. Gleckler