225: Dictionaries

by John Cowan (spec) and Arvydas Silanskas (implementation)

Status

This SRFI is currently in draft status. Here is an explanation of each status that a SRFI can hold. To provide input on this SRFI, please send email to srfi-225@nospamsrfi.schemers.org. To subscribe to the list, follow these instructions. You can access previous messages via the mailing list archive.

Abstract

The procedures of this SRFI allow callers to manipulate an object that maps keys to values without the caller needing to know exactly what the type of the object is. Such an object is called a dictionary in this SRFI.

Issues

dict=?, dict<?, etc. Compare dictionaries for equality, subset, etc. A value comparator is passed in.

dict-union(!), dict-intersection(!), dict-difference(!), dict-xor(!).

dict-range=(!), dict-range<(!), etc. Return subsets whose keys are =, <, etc. of a provided value.

dict-open-mapping, dict-closed mapping, dict-open-closed-mapping, dict-closed-open-mapping. Returns subsets whose keys are in a certain interval specified by upper and lower bounds.

dict-min-key, dict-max-key: Returns the smallest and largest key in the dictionary.

dict-key-successor, dict-key-predecessor: Return preceding and following key.

Rationale

Until recently there was only one universally available mechanism for managing key-value pairs: alists. Most Schemes also support hash tables, but until R6RS there was no standard interface to them, and many implementations do not provide that interface.

Now, however, the number of such mechanisms is growing. In addition to both R6RS and R7RS hash tables, there are persistent ordered and hashed mappings from SRFI 146 and ordered key-value stores (often on a disk or a remote machine) from SRFI 167.

It’s inconvenient for users if SRFIs or other libraries have to insist on accepting only a specific type of dictionary. This SRFI exposes a number of accessors, mutators, and other procedures that can be called on any dictionary, provided that a dictionary type descriptor (DTD, with apologies to SGML and XML users) is available for it: either exported from this SRFI, or from other SRFIs or libraries, or created by the user. DTDs are of an unspecified type.

This in turn requires that the DTD provides a predicate that can recognize its dictionaries, plus at least these primitive operations: create an empty dictionary from a comparator; determine a dictionary’s current size; reference, update, or insert an element of the dictionary depending on its current contents; and process all the elements using a procedure invoked for its side effects.

By using the procedures of this SRFI, a procedure can take a DTD and a dictionary as an argument and make flexible use of the dictionary without knowing its exact type.

Note that dictionaries must still be constructed using type-specific constructors, as the required and optional arguments differ in each case.

Specification

We call a specific key-value combination an association. This is why an alist, or association list, is called that; it is a list of associations represented as pairs.

A dictionary is a collection of associations which may be ordered or unordered. In principle an equality predicate is enough, given a key, to find if an association with that key exists in the dictionary. However, for efficiency most dictionaries require an ordering predicate or a hash function as well.

When a key argument is said to be the same as some key of the dictionary, it means that they are the same in the sense of the dictionary’s implicit or explicit equality predicate. It is assumed that no dictionary contains two keys that are the same in this sense. Two dictionaries are similar if they are of the same type and have the same equality predicate and the same ordering predicate and/or hash function.

Lists as dictionaries

Lists are supported as dictionaries using the specification in this section. If two keys are the same (in the sense of the specified equality predicate), then all but the first are treated as if they did not exist.

If plist-dtd (see below) is used with a list, then the list is assumed to be a property list, alternating symbol keys with values. The linear-update mutation operations actually mutate the property list whenever possible. The equality predicate of this type of dictionary is eq?.

If a list is empty or its car is a pair, then the list can be treated as an alist. New values are added to the beginning of an alist and the new alist is returned; linear-update operations do not mutate the alist, but return an alist that may or may not share storage with the original alist. If an association has been updated, then both the new and the old association may be processed by the whole-dictionary procedures. The examples in this SRFI use alists.

However, an alist (unlike a hashtable or mapping) does not know which equality predicate its users intend to use on it. Therefore, rather than exporting a single DTD for alists, this SRFI provides a procedure make-alist-dtd that takes an equality predicate and returns a DTD specialized for manipulation of alists using that predicate.

In all other cases, lists cannot be treated as dictionaries unless a DTD exists.

Procedures

Each of the following examples is assumed to be prefixed by the following definitions:

(define dict (list '(1 . 2) '(3 . 4) '(5 . 6)))
(define aed alist-eqv-dtd)
Consequently, previous examples don't affect later ones.

The dtd argument is not discussed in the individual procedure descriptions below, but it is an error if invoking the dictionary? procedure that was used to make the DTD on dictionary returns #f.

Constructor

(make-dictionary dtd comparator)

Returns an empty dictionary of the type described by the DTD using comparator to specify the dictionary's equality predicate and its ordering predicate and/or hash function.

If the contents of comparator are inconsistent with the dictionary type, it is an error. If the dictionary type does not accept a comparator, #f should be passed instead.

(dict-unfold dtd comparator stop? mapper successor seed)

Create a new dictionary as if by make-dictionary using dtd and comparator. If the result of applying the predicate stop? to seed is true, return the dictionary. Otherwise, apply the procedure mapper to seed. Mapper returns two values, which are inserted into the dictionary as the key and the value respectively. Then get a new seed by applying the procedure successor to seed, and repeat this algorithm.

Predicates

(dictionary? dtd obj)

Returns #t if obj answers #t to the type predicate stored in the DTD, and #f otherwise.

(dictionary? aed dict) => #t

(dict-empty? dtd dict)

Returns #t if dictionary contains no associations and #f if it does contain associations.

(dict-empty? aed '()) => #t
(dict-empty? aed dict) => #f

(dict-contains? dtd dictionary key)

(dict-contains? aed dict 1) => #t
(dict-contains? aed dict 2) => #f

Returns #t if one of the keys of dictionary is the same as key, and #f otherwise.

Lookup

(dict-ref dtd dictionary key [failure [success] ])

If key is the same as some key of dictionary, then invokes success on the corresponding value and returns its result. If key is not a key of dictionary, then invokes the thunk failure and returns its result. The default value of failure signals an error; the default value of success is the identity procedure.

(dict-ref aed dict 1 (lambda () '()) list) => (1) ; Success wraps value in a list
(dict-ref aed dict 2 (lambda () '()) list) => ()  ; Failure returns empty list

(dict-ref/default dtd dictionary key default)

If key is the same as some key of dictionary then returns the corresponding value. If not, then returns default.

(dict-ref/default aed dict 1 #f) => 1
(dict-ref/default aed dict 1 #f) => #f

Mutation

All these procedures exist in pairs, with and without a final !. The ones without ! are functional: they never side-effect their arguments, but may copy them. The ones with ! are linear-update: they may return either a new dictionary object (which may or may not share storage with the dictionary argument), or the same dictionary object, mutated. In either case it is an error to access the dictionary later through any other reference to it, as that reference may have been invalidated. These procedures are defined as linear-update because the underlying type-specific operations may be functional, mutational, or themselves linear-update.

(dict-set dtd dictionary obj)
(dict-set! dtd dictionary obj)

Returns a dictionary that contains all the associations of dictionary plus those specified by objs, which alternate between keys and values. If a key to be added already exists in dictionary, the new value prevails.

; new values are prepended
(dict-set aed dict 7 8) => ((7 . 8) (1 . 2) (3 . 4) (5 . 6))
(dict-set aed dict 3 5) => ((3 . 5) (1 . 2) (3 . 4) (5 . 6)

(dict-adjoin dtd dictionary objs)
(dict-adjoin! dtd dictionary objs)

Returns a dictionary that contains all the associations of dictionary plus those specified by objs, which alternate between keys and values. If a key to be added already exists in dictionary, the old value prevails.

; new values are prepended to alists
(dict-adjoin aed dict 7 8) => ((7 . 8) (1 . 2) (3 . 4) (5 . 6))
(dict-adjoin aed dict 3 5) => ((1 . 2) (3 . 4) (5 . 6)

(dict-delete dtd dictionary key)
(dict-delete! dtd dictionary key)

Returns a dictionary that contains all the associations of dictionary except those whose keys are the same as one of the keys.

; new values are prepended
(dict-delete aed dict 1 3) => ((5. 6)) ; may share
(dict-delete aed dict 5) => ((1 . 2) (3 . 4)

(dict-delete-all dtd dictionary keylist)
(dict-delete-all! dtd dictionary keylist)

Returns a dictionary with all the associations of dictionary except those whose keys are the same as some member of keylist.

(dict-delete-all aed dict '(1 3)) => ((5 . 6))

(dict-replace dtd dictionary key value)
(dict-replace! dtd dictionary key value)

Returns a dictionary that contains all the associations of dictionary except as follows: If key is the same as a key of dictionary, then the association for that key is omitted and replaced by the association defined by the pair key and value. If there is no such key in dictionary, then dictionary is returned unchanged.

(dict-replace aed dict 1 3) => ((1 . 3) (1 . 2) (3 . 4) (5 . 6)) 

(dict-intern dtd dictionary key failure)
(dict-intern! dtd dictionary key failure)

If there is a key in dictionary that is the same as key, returns two values, dictionary and the value associated with key. Otherwise, returns two values, a dictionary that contains all the associations of dictionary and in addition a new association that maps key to the result of invoking failure, and the result of invoking failure.

(dict-intern aed dict 1 (lambda () #f)) => ; 2 values
  ((1 . 2) (3 . 4) (5 . 6))
  3
(dict-intern aed dict 2 (lambda () #f)) => ; 2 values
  ((2 . #f) (1 . 2) (3 . 4) (5 . 6))
  #f

(dict-update dtd dictionary key updater [failure [success] ])
(dict-update! dtd dictionary key updater [failure [success] ])

Retrieves the value of key as if by dict-ref, invokes updater on it, and sets the value of key to be the result of calling updater as if by dict-set, but may do so more efficiently. Returns the updated dictionary. The default value of failure signals an error; the default value of success is the identity procedure.

(dict-update/default dtd dictionary key updater default)
(dict-update/default! dtd dictionary key updater default)

Retrieves the value of key as if by dict-ref/default, invokes updater on it, and sets the value of key to be the result of calling updater as if by dict-set, but may do so more efficiently. Returns the updated dictionary.

(dict-pop dtd dictionary)
(dict-pop! dtd dictionary)

Chooses an association from dictionary and returns three values: a dictionary that contains all associations of dictionary except the chosen one, and the key and the value of the association chosen. If the dictionary is ordered, the first association is chosen; otherwise the chosen association is arbitrary.

If dictionary contains no associations, it is an error.

(dict-pop aed dict) => # 3 values
  ((3 . 4) (5 . 6))
  1
  2

(dict-map dtd proc dictionary)
(dict-map! dtd proc dictionary)

Returns a dictionary similar to dictionary that maps each key of dictionary to the value that results from invoking proc on the corresponding key and value of dictionary.

(dict-map (lambda (k v) (cons v k)) aed dict) => ((2 . 1) (4 . 3) (6 . 5))
(dict-map! (lambda (k v) (cons v k)) aed dict) => ((2 . 1) (4 . 3) (6 . 5))

(dict-filter dtd pred dictionary)
(dict-filter! dtd pred dictionary)
(dict-remove dtd pred dictionary)
(dict-remove! dtd pred dictionary)

Returns a dictionary similar to dictionary that contains just the associations of dictionary that satisfy / do not satisfy pred when it is invoked on the key and value of the association.

(dict-filter (lambda (k v) (= k 1)) aed dict) => ((1 . 2))
(dict-remove (lambda (k) (= k 1)) aed dict) => ((3 . 4) (5 . 6))

(dict-search dtd dictionary key failure success)
(dict-search! dtd dictionary key failure success)

This procedure is a workhorse for dictionary lookup, insert, and delete. The dictionary dictionary is searched for an association whose key is the same as key in the sense of dictionary. If one is not found, then the failure procedure is tail-called with two procedure arguments, insert and ignore, and is expected to tail-call one of them.

However, if such an association is found, then the success procedure is tail-called with the matching key of dictionary, the associated value, and two procedure arguments, update and remove, and is expected to tail-call one of them.

It is an error if the procedure arguments are invoked other than in tail position in the failure and success procedures. It is also an error if the failure and success procedures return to their implicit continuation without invoking one of their arguments.

The behaviors of the procedures are as follows (where obj is any Scheme object):

In all cases, obj is returned as a second value.

Here are four examples of dict-search, one for each of the four procedures:

     ;; ignore
     (define-values
       (dict value)
       (dict-search (alist->dict '((a . b))) 'c
              (lambda (insert ignore)
                (ignore 'foo))
              (lambda args
                (error))))
     (dict->alist aed dict)) => ((a . b))
     value => 'foo

     ;; insert
     (define-values
       (dict value)
       (dict-search (alist->dict '((a . b))) 'c
              (lambda (insert ignore)
                (insert 'd 'foo))
              (lambda args
                (error))))
     (dict-ref aed dict 'a)) => b
     (dict-ref aed dict 'c)) => 'd
     value => foo

     ;; update
     (define-values
       (dict value)
       (dict-search (alist->dict '((a . b))) 'a
              (lambda args
                (error))
              (lambda (key value update delete)
                (update 'a2 'b2 'foo))))
     (dict->alist aed dict) => ((a2 . b2)
     value => foo

     ;; delete
     (define-values
       (dict value)
       (dict-search (alist->dict '((a . b) (c . d))) 'a
              (lambda args
                (error))
              (lambda (key value update delete)
                (delete 'foo))))
     (dict->alist aed dict)) => ((c . d))
     value => foo

The whole dictionary

(dict-copy dtd dictionary)

Returns a dictionary that is similar to dictionary and contains the same associations. Mutating the result of dict-copy does not affect dictionary or vice versa.

(dict-size dtd dictionary)

Returns an exact integer representing the number of associations in dictionary.

(dict-size aed dict) => 3

(dict-for-each dtd proc dictionary)

Invokes proc on each key of dictionary and its corresponding value in that order. This procedure is used for doing operations on the whole dictionary. If the dictionary type is inherently ordered, associations are processed in the inherent order; otherwise in an arbitrary order. Returns an unspecified value.

(define (write-key key value) (write key))
(dict-for-each write-key aed dict) => unspecified
  ; writes "135" to current output

(dict-count dtd pred dictionary)

Passes each association of dictionary as two arguments to pred and returns the number of times that pred returned true as an an exact integer.

(dict-count aed dict (lambda (k v) (even? k))) => 0

(dict-any dtd pred dictionary)

Passes each association of dictionary as two arguments to pred and returns the value of the first call to pred that returns true, after which no further calls are made. If the dictionary type is inherently ordered, associations are processed in the inherent order; otherwise in an arbitrary order. If all calls return false, dict-any returns false.

(define (both-even? k v) (and (even? k) (even? v)))
(dict-any both-even? '((2 . 4) (3 . 5))) => #t
(dict-any both-even? '((1 . 2) (3 . 4))) => #f

(dict-every dtd pred dictionary)

Passes each association of dictionary as two arguments to pred and returns #f after the first call to pred that returns false, after which no further calls are made. If the dictionary type is inherently ordered, associations are processed in the inherent order; otherwise in an arbitrary order. If all calls return true, dict-any returns the value of the last call, or #t if no calls are made.

(define (some-even? k v) (or (even? k) (even? v)))
(dict-every some-even? '((2 . 3) (3 . 4))) => #t
(dict-every some-even? '((1 . 3) (3 . 4))) => #f

(dict-keys dtd dictionary)

Returns a list of the keys of dictionary. If the dictionary type is inherently ordered, associations are processed in the inherent order; otherwise in an arbitrary order. The order may change when new elements are added to dictionary.

(dict-keys aed dict) => (1 3 5)

(dict-values dtd dictionary)

Returns a list of the values of dictionary. The results returned by dict-keys and dict-values are not necessarily ordered consistently.

(dict-values aed dict) => (2 4 6)

(dict-entries dtd dictionary)

Returns two values, the results of calling dict-keys and the result of calling dict-values on dictionary.

(dict-entries aed dict) => ; 2 values
  (1 3 5)
  (2 4 6)

(dict-fold dtd proc knil dictionary)

Invokes proc on each association of dictionary with three arguments: the key of the association, the value of the association, and an accumulated result of the previous invocation. For the first invocation, knil is used as the third argument. Returns the result of the last invocation, or knil if there was no invocation. Note that there is no guarantee of a consistent result if the dictionary does not have an inherent order.

(dict-fold + 0 '((1 . 2) (3 . 4))) => 10

(dict-map->list dtd proc dictionary)

Returns a list of values that result from invoking proc on the keys and corresponding values of dictionary.

(dict-map->list (lambda (k v) v) dict) =>gt; (2 4 6),
(dict-map->list - aed dict) => (-1 -1 -1) ; subtract value from key

(dict->alist dtd dictionary)

Returns an alist whose keys and values are the keys and values of dictionary.

; plist to alist
(dict->alist '(1 2 3 4 5 6)) => ((1 . 2) (3 . 4) (5 . 6))

Add dict-map, dict-filter, dict-remove, and dict-search.

(dict-comparator dtd dictionary)

Return a comparator representing the type predicate, equality predicate, ordering predicate, and hash function of dict. The last two may be #f if the dictionary does not make use of these functions.

Dictionary type descriptors

(dtd? obj)

Returns #t if obj is a DTD, and #f otherwise.

(make-dtd arg)
(dtd (procindex proc) …)     [syntax]

Returns a new dictionary type providing procedures that allow manipulation of dictionaries of that type. The args are alternately procindex names and corresponding procs.

A procindex argument is the value of a variable whose name is the same as a procedure (except those in this section and the Exceptions section) suffixed with -index and a proc argument is the specific procedure implementing it for this type. These procedures only need to handle a full-length argument list (except when defining dict-ref and dict-update!), as the other defaults have already been supplied by the framework.

FIXME: Is the last sentence still true?

Arguments for the six procedures make-dictionary, dictionary?, dict-size, dict-search, dict-search!, and dict-for-each are needed to make a fully functional DTD, but it is not an error to omit them. The others are optional, but if provided can be more efficient than the versions automatically provided by the implementation of this SRFI.

The dtd macro behaves like a wrapper around make-dtd, but may also verify that the procindex names are valid, that there are no duplicates, etc.

The following examples are from the file plist-impl.scm in the sample implementation; the procedures referred to are also in that file.

  (make-dtd
    make-dictionary-index '()
    dictionary?-index plist?
    dict-search-index plist-search
    dict-search!-index plist-search!
    dict-size-index plist-size
    dict-for-each-index plist-foreach
    dict->alist-index plist->alist)) => a DTD for plists

  (dtd
    (make-dictionary-index '())
    (dictionary?-index plist?)
    (dict-search-index plist-search)
    (dict-search!-index plist-search!)
    (dict-size-index plist-size)
    (dict-for-each-index plist-foreach)
    (dict->alist-index plist->alist)) => a DTD for plists

(make-modified-dtd dtd obj ...)

Returns a DTD that is equivalent to dtd except that the alternating procindexes and procs are used to replace the corresponding entries in dtd. Caution should be used when replacing any procedure other than the six listed in the definition of make-dtd.

A common use of this is to replace the implementation of make-dictionary with one that provides specific arguments to the underlying dictionary-type-specific constructor. (make-hash-table, e.g.)

(make-modified-dtd hash-table-dtd
  make-dictionary-index
    (lambda (dtd comparator)
      (make-hash-table comparator 'weak-keys))) =>
  a DTD for weak hash tables

(make-alist-dtd equal)

Returns a DTD for manipulating an alist using the equality predicate equal.

(make-alist-dtd =) => a DTD for alists using numeric equality

(dtd-ref dtd procindex)

Returns the procedure designated by procindex from dtd. This allows the ability to call a particular DTD procedure more efficiently multiple times.

Exceptions

(dictionary-error message irritant ... )

Returns a dictionary error with the given message (a string) and irritants (any objects). If a particular procedure in a DTD cannot be implemented, it instead should signal an appropriate dictionary exception that can be reliably caught.

(dictionary-error? obj)

Returns #t if obj is a dictionary error and #f otherwise.

(dictionary-message dictionary-error)

Returns the message associated with dictionary-error.

(dictionary-irritants dictionary-error)

Returns a list of the irritants associated with dictionary-error.

Variables

The following procindex variables are exported from this DTD: make-dictionary-index, dict-unfold, dictionary?-index, dict-empty?-index, dict-contains?-index, dict-ref-index, dict-ref/default-index, dict-set-index, dict-adjoin-index, dict-adjoin!-index, dict-delete-index, dict-delete!-index, dict-delete-all-index, dict-delete-all!-index, dict-replace-index, dict-replace!-index, dict-update-index, dict-update!-index, dict-update/default-index, dict-update/default!-index, dict-pop-index, dict-pop!-index, dict-map-index, dict-map!-index, dict-filter-index, dict-filter!-index, dict-remove-index, dict-remove!-index, dict-search-index, dict-search!-index, dict-copy-index, dict-size-index, dict-count-index, dict-any-index, dict-every-index, dict-keys-index, dict-values-index, dict-entries-index, dict-fold-index, dict-map->list-index, dict-dict->alist-index, dict-comparator-index.

The following DTDs are also exported from this SRFI: srfi-69-dtd, hash-table-dtd, srfi-126-dtd, mapping-dtd, hash-mapping-dtd, plist-dtd, alist-eqv-dtd, and alist-equal-dtd. The last two provide DTDs for alists using eqv? and equal? respectively.

Implementation

The sample implementation is found in the GitHub repository.

The following list of dependencies is designed to ease registering new dictionary types that may not have complete dictionary APIs: