Declarative Languages
Lecture #8

Purpose: Equality, hash-tables and blocks.

8.1 Introduction: equality

We encountered earlier various predicates for comparing specific types of lisp object:

• = for numbers (compares any number of objects)
• char= for characters (compares any number of objects)
• string= and string-equal for strings (compares precisely two objects)
and one predicate
• equal (compares precisely two objects)
for comparing general lisp objects. I have assured you that if two objects print the same then they are equal. Now let's get closer to the truth, by introducing two further general equality predicates (each taking precisely two arguments): the functions eq and eql.

8.2 Equality and eq

Two objects are eq only if they are in fact the same object. Quite how this works depends on their types. Obviously, if the objects have different types they cannot be the same object and so they cannot be eq.

• If two symbols print the same then they are defined to be identical and hence they are guaranteed to be eq:
(mapcar 'eq '(foo t nil :wibble) '(foo t nil :wibble))  =>  (t t t t)
• If two integers have the same value and are both fixnums then in practice they are eq (strictly, the standard leaves this to the implementation). To be a fixnum, an integer has to lie between the constants most-negative-fixnum and most-positive-fixnum inclusive. The values of these two constants are implementation-dependent. In LispWorks for Windows, they are -2^23 and +2^23-1. Most implementations these days go higher than that in the fixnum range (typically to 2^28 or 2^29).
• In theory, you cannot guarantee that typing in the same character twice will result in two eq objects: eg
(eq #\Space #\Space)
may turn out to be false. In practice however no implementation worth its salt will harass you like this and for LispWorks the above expression is true.
• With any other objects, looking the same is no guarantee of eq. Typing in two identical-looking bignums (a bignum is an integer which is not a fixnum), or floats, or lists, or strings, or vectors, will give you numbers which are =, or sequences which have the same members, but the reader has generated different structures at different addresses in memory and these will not be eq.
(eq 1.0 1.0)  =>  nil
(eq 8388608 8388608)  =>  nil  ; but may be t in other implementations
(eq '(a b c) '(a b c))  =>  nil
(eq "this takes some thought" "this takes some thought")  =>  nil
If you have two pointers to the same thing then they will be eq. For example, (eq something something) is true no matter what value something has. The following function too will always return true (for any argument):
(defun always-true (thing)
  (let* ((my-list (list thing)))
    (eq thing (first my-list))))
Note that if a function generates new objects, then these cannot be eq to each other:
CL-USER 21 > (let* ((things nil))
               (dotimes (i 2)
                 (push '(t) things))    ; pushing the same object each time
               (eq (first things)
                   (second things)))
T

CL-USER 22 > (let* ((things nil))
               (dotimes (i 2)
                 (push (list t) things))    ; pushing a new object each time
               (eq (first things)
                   (second things)))
NIL

CL-USER 23 >

This includes all functions (e.g. copy-list) which are defined as returning a fresh copy of some object, for example:

(let* ((foo '(1 2 3 4))) (equal foo (butlast foo 0)))  =>  t
(let* ((foo '(1 2 3 4))) (eq foo (butlast foo 0)))     =>  nil
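
The points of this section can be tried out as follows. This is only a sketch: the variable names are made up for illustration, and the lists are built with list rather than quoted so that the results do not depend on how an implementation treats literal data. The standard function typep (not covered above) is used to ask whether an integer is a fixnum.

```lisp
;; The same pointer is eq to itself; a fresh copy has equal
;; contents but is made of different cons cells.
(let* ((original (list 1 2 3 4))
       (alias original)
       (copy (copy-list original)))
  (list (eq original alias)      ; same object
        (equal original copy)    ; same contents
        (eq original copy)))     ; different objects
;; =>  (t t nil)

;; The fixnum range is implementation-dependent, but typep will
;; always tell you whether a given integer falls inside it.
(list (typep most-positive-fixnum 'fixnum)
      (typep (1+ most-positive-fixnum) 'fixnum))
;; =>  (t nil)
```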

8.3 Equality and eql

Two objects are eql if

• they are eq or
• they are both numbers of the same type and the same value or
• they are both characters that represent the same character (as noted above, this distinction is not worth bothering with and you can in practice assume that two identical looking characters are eql simply because they are eq).
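
A few expressions to try at the listener, one or two for each clause above. This is an illustrative sketch rather than an exhaustive test; the bignums are computed with expt and the strings are built with copy-seq so that each call really does see two separate objects.

```lisp
;; Bignums with the same value are eql, although they need not be eq.
(eql (expt 2 100) (expt 2 100))            ; =>  t

;; Same type and same value: eql.
(eql 1.5 1.5)                              ; =>  t

;; Same numerical value but different types: = holds, eql does not.
(= 1 1.0)                                  ; =>  t
(eql 1 1.0)                                ; =>  nil

;; Characters which represent the same character are eql.
(eql #\a #\a)                              ; =>  t
(eql #\a #\A)                              ; =>  nil

;; Strings are neither numbers nor characters, so two separately
;; built strings are not eql even though they print the same.
(eql (copy-seq "foo") (copy-seq "foo"))    ; =>  nil
```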
A large number of lisp functions use a predicate for comparing objects; this tends to be specified as an optional argument and the default value is typically eql (see section 17.2.1 of the HyperSpec). As an example, consider the function position which takes an object and a sequence, and returns the first index into the sequence at which the object was found (or nil if it was not found):
(position 'wibble '(foo bar wibble baz wombat))  =>  2
The objects are compared by eql, unless another predicate is handed in as the value of the :test argument:
(position "wibble" '("foo" "bar" "wibble" "baz" "wombat"))
=>  nil
(position "wibble" '("foo" "bar" "wibble" "baz" "wombat") :test 'string=)
=>  2
Related to position is the function position-if which takes a predicate (of one argument) and a sequence:
(position-if (lambda (x) (and (numberp x) (plusp x) (evenp x)))
             '(digits of pi are 3 1 4 1 5 9))
=>  6
and related to both of these are find and find-if, which return the item which was found rather than its position.
(let ((bits '("foo" "bar" "wibble" "baz" "wombat")))
  (eq (third bits)
      "wibble"))  =>  nil

(let ((bits '("foo" "bar" "wibble" "baz" "wombat")))
  (eq (third bits)
      (find "wibble" bits :test 'string=)))  =>  t
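
For comparison with the position and position-if examples, here is a short sketch of find and find-if on the same data. Nothing here goes beyond the functions already named in this section.

```lisp
;; find returns the matching element itself (compared by eql by default).
(find 'wibble '(foo bar wibble baz wombat))          ; =>  wibble

;; find-if takes a one-argument predicate, just like position-if.
(find-if (lambda (x) (and (numberp x) (plusp x) (evenp x)))
         '(digits of pi are 3 1 4 1 5 9))            ; =>  4

;; Both return nil when nothing matches.
(find 'aardvark '(foo bar wibble baz wombat))        ; =>  nil
(find-if 'stringp '(3 1 4 1 5 9))                    ; =>  nil
```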

8.4 Revisiting equal

Two objects are equal if

• they are eql or
• they are strings, of the same length, which match (by eql) character for character or
• they are both of type cons and
1. the two cars are equal and
2. the two cdrs are equal
• (there are a couple of further conditions under which this predicate is true but we haven't met the objects they apply to yet)
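
Some expressions which exercise each clause of this definition. Again this is just a sketch for experimenting with; the lists are built with list so that the two arguments are guaranteed to be separate structures.

```lisp
;; Conses are compared car by car and cdr by cdr, strings
;; character by character.
(equal (list 'a (list 'b "c") 2)
       (list 'a (list 'b "c") 2))      ; =>  t

;; equal respects case in strings, as string= does ...
(equal "Wibble" "wibble")              ; =>  nil
;; ... whereas string-equal ignores it.
(string-equal "Wibble" "wibble")       ; =>  t

;; Numbers fall through to eql, so the type still matters.
(equal 1 1.0)                          ; =>  nil
```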
8.5 Hash-tables (an excuse for knowing about eql)

We know about the following general (in the sense that they can contain any lisp objects) data structures:

• cons: used for building lists and trees; good for flexibility as you can add or remove cells very easily; slow access over long sequences
• vector: one-dimensional array; inflexible but fast access no matter how large the sequence
• structure: user-defined type; behaves like a vector whose fields are named rather than numbered
We now introduce the hash-table.  This is a data structure whose indices may be general lisp objects, which offers flexibility similar to lists and which delivers lookup times intermediate between lists and vectors.

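Name       | Index by                               | Flexibility | Data ordered into sequence? | Speed                                   | Use
-----------+----------------------------------------+-------------+-----------------------------+-----------------------------------------+----------------------------------------------------
cons       | first and rest                         | good        | yes                         | slow access over long sequence          | building lists and binary trees
vector     | numerical index                        | poor        | yes                         | fast, independent of length of sequence | random lookup and rapid traversal of large data sets
structure  | field name (not available at run-time) | poor        | no                          | fast, independent of number of fields   | user-defined types
hash-table | any lisp object                        | good        | no                          | intermediate                            | dictionaries, general object maps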

If we weren't bothered about lookup times, we could implement something like this with lists:

CL-USER 11 > (defun get-from-list (index list)
               (dolist (list-member list)
                 (let* ((maybe-index (first list-member))
                        (maybe-value (second list-member)))
                   (if (equal index maybe-index)
                       (return maybe-value)))))
GET-FROM-LIST

CL-USER 12 > (defparameter phone-numbers
               '(("Nick" 2330) ("Martin R" 2356) ("Bob" 2342)))
PHONE-NUMBERS

CL-USER 13 > (get-from-list "Bob" phone-numbers)
2342

CL-USER 14 > (get-from-list "Ethel the Aardvark" phone-numbers)
NIL

CL-USER 15 >

(and with more, slightly nastier code to add, reset and remove the phone numbers). Using a hash-table hides the above nastiness, and lookup stays reasonably fast even when the table gets large.

To make a hash-table, call the function make-hash-table. To look values up in the table use the function gethash (setfable). To remove a single entry altogether use remhash, and to empty a hash-table completely call clrhash. For example:

CL-USER 9 > (defparameter *table* (make-hash-table :test 'equal))
*TABLE*

CL-USER 10 > (dolist (pair '(("Nick" 2330) ("Martin R" 2356) ("Bob" 2342)))
               (let* ((key (first pair))
                      (value (second pair)))
                 (setf (gethash key *table*) value)))
NIL

CL-USER 11 > (gethash "Bob" *table*)
2342
T

CL-USER 12 > (gethash "Ethel the Aardvark" *table*)
NIL
NIL

CL-USER 13 >

Notes:
• make-hash-table takes a keyword argument :test which determines how keys (i.e. the indices) will be compared. You do not have an open choice of any predicate here: you are limited to eq, eql, equal and equalp (look this last one up if you feel a burning urge to do so). The default test is, as ever, eql.
• gethash is like nth (and unlike aref): it takes the key as first argument and the table comes second.
• gethash returns two values (like read-line did): the second value tells you whether anything was found or not. This allows you to distinguish between finding nil and not finding anything (in both cases, the primary return value is nil).
• (setf gethash) can be used both to add new values to the table (as in the above example) and to reset existing values.
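
The notes above, together with remhash and clrhash, can be sketched like this. The table *extensions* and the keys in it are invented for the example; multiple-value-bind and hash-table-count are standard functions which have not been introduced in these notes, used here only to make the results visible.

```lisp
(defparameter *extensions* (make-hash-table :test 'equal))

(setf (gethash "Nick" *extensions*) 2330)
(setf (gethash "Bob" *extensions*) nil)     ; a key whose value is nil

;; The second return value distinguishes "found nil" from "not found".
(multiple-value-bind (value found) (gethash "Bob" *extensions*)
  (list value found))                       ; =>  (nil t)
(multiple-value-bind (value found) (gethash "Ethel" *extensions*)
  (list value found))                       ; =>  (nil nil)

;; Resetting an existing value uses the same setf form as adding one.
(setf (gethash "Nick" *extensions*) 2331)
(gethash "Nick" *extensions*)               ; =>  2331 and t

;; remhash removes one entry; clrhash empties the table.
(remhash "Nick" *extensions*)
(hash-table-count *extensions*)             ; =>  1
(clrhash *extensions*)
(hash-table-count *extensions*)             ; =>  0
```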
Once you've built a hash-table, a useful function for traversing it is maphash, which takes a function and a hash-table as arguments. The function is invoked once for each entry in the table, with two arguments (a key and the corresponding value). For example:
CL-USER 13 > (maphash (lambda (name number)
                        (format t "~&~a is on extension ~a"
                                name number))
                      *table*)
Nick is on extension 2330
Martin R is on extension 2356
Bob is on extension 2342
NIL

CL-USER 14 >

Note:
• the order in which the entries are processed by maphash is implementation-defined and may not even be the same on two successive runs.
• maphash always returns nil.
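
Because maphash returns nil, the usual way to get results out of it is to collect them into a variable from inside the function you pass. The following is one possible sketch; table-entries is a made-up name, and the call to the standard function sort is only there so that the answer does not depend on the traversal order.

```lisp
(defun table-entries (table)
  "Return a list of (key value) pairs from TABLE, in no particular order."
  (let ((entries nil))
    (maphash (lambda (key value)
               (push (list key value) entries))
             table)
    entries))

(let ((table (make-hash-table :test 'equal)))
  (setf (gethash "Nick" table) 2330
        (gethash "Bob" table) 2342)
  ;; Sort by key so the result is the same whatever order maphash used.
  (sort (table-entries table) 'string< :key 'first))
;; =>  (("Bob" 2342) ("Nick" 2330))
```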
8.6 Blocks

We have met the macro return which allows "premature" exit from the various looping macros (dotimes dolist loop etc). A generalization of this is the special operator return-from which in particular allows early exit from any (named) function.

CL-USER 14 > (defun one-value (table)
               (maphash (lambda (key value)
                          (declare (ignore key))
                          (return-from one-value
                            value))
                        table))
ONE-VALUE

CL-USER 15 > (one-value *table*)
2330

CL-USER 16 >

The above (admittedly somewhat pointless) function returns one value extracted from the hash-table supplied as its argument.
• When we're within the body of a (named) function, we are said to also be inside a block with the same name. So while we are in the body of one-value, we are inside a block named one-value.
• Within a block, you can leave at any time with the special operator return-from. It takes two arguments: the first (not evaluated) is the name of the block you want to leave, the second (evaluated; optional and defaulting to nil) is the value to return.
• The looping macros (dotimes dolist loop etc) establish a block named nil, so you could exit them by calling (return-from nil). The macro return is shorthand for this.
• The above examples of blocks (established by defun or by the looping macros) are said to be implicit, because they are created behind your back. You can establish blocks of your own, at any point in your code, using the special operator block (look it up in the HyperSpec).
• The special operator return-from is said to be lexical in scope - it only works within the textual confines of the block it refers to.
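
As an illustration of an explicit block, here is a sketch in which return-from leaves two nested loops at once. The function first-negative and the block name scan are invented for the example. A plain return would only leave the inner dolist, because it refers to the innermost block named nil.

```lisp
(defun first-negative (rows)
  "Return the first negative number found in ROWS, a list of lists of numbers."
  (block scan
    (dolist (row rows)
      (dolist (item row)
        (if (minusp item)
            ;; Leaves both loops, and the block, in one go.
            (return-from scan item))))))

(first-negative '((1 2) (3 -4 5) (-6)))     ; =>  -4
(first-negative '((1 2) (3 4)))             ; =>  nil
```

When nothing is found the outer dolist finishes normally and returns nil, which then becomes the value of the block.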
Also in the above code you should be aware of the following:
• In the lambda form in one-value, two arguments have to be supplied (because that's how maphash works) but only one is actually wanted (or used). The (declare (ignore ...)) form is included immediately after the parameter list to prevent a compiler warning along the lines of
;;;*** Warning in ONE-VALUE: KEY is bound but not referenced
Declarations can appear after function parameter lists, after the bindings in let*, and in many other macros and special operators - see figure 3-23 in the HyperSpec for the full list.
8.7 Practical session / Suggested activity

Convert last week's work to store student records in a hash-table (accessible by name) rather than in a list. Write functions to add a new student, to find the record of a student with a given name, and to delete a student.

As before, write functions to name the three students who have the highest marks, or to spot which lecturer fails most of their students.

Use return-from in a function to return the SID of any student who hasn't attempted any modules at all.

Comment on which data structure was "best". [Define "best".]

• Look up the definitions of the functions eq, eql and equal in the HyperSpec. Try out many examples. Make really sure you appreciate the difference between them.
• Either: justify carefully the statement that "if two objects print the same then they are equal." or give a simple counter-example. Consider vectors.
• Define your own version of equal (call it my-equal) in terms of eql; define eql in terms of eq, = and char=.
• Actually, position takes more keyword arguments than the one (:test) given above. Look them up and try them out.
• To see multiple-values taken to a mild excess, call the function (get-decoded-time).  If you can't figure out what the result means, try calling it once or twice more (and then, if necessary, either ask me or look it up). Read section 5.5 of Graham and implement a function which prints today's date, or the time, or both. (You might attempt to emulate the format of UNIX date(1), eg "Fri Sep 15 14:12:41 BST 2000".)
• Write a function of two arguments which simply returns its first argument. Defining your function should not cause any warnings to be signalled.
• Redefine function position-three (section 5.4) to use return-from rather than return. Do this twice: first returning from a block named nil, then returning instead from a block named position-three. Does either of these new functions give you anything which the original didn't?
• Define a function which doubles every member of a list. If any member of that list is not a number, simply return nil from your function.
• Suppose by some ghastly misunderstanding there was a lisp where the implementers had forgotten to include the type vector or any operations based on it. See if you can use hash-tables to plug the gap, implementing enough of the basics (my-make-array, my-length, my-aref and my-setf-aref at the very least) to prove that the concept works. Make sure that my-aref checks that the index is within bounds (you can store the upper-bound in the hash-table).
• Jon L White (jonl@ptolemy.arc.nasa.gov) writes:

>
> I showed up at MIT around the summer of 1966 (as a cross-registered
> graduate student from Harvard) and the FOO, of FOO and BAR, was generally
> recognized then as a variant, and a softening, of the oft-used phrases
> from the American Military organizations "Situation Normal: All F***ed Up",
> for "Snafu"  and "F***ed Up Beyond All Recognition" for "Fubar"
>
> No one has---to my knowledge---verified this as an accurate origin; merely
> that the story, as told, has it roots probably prior to the Vietnam War,
> and maybe even going back before the second World War.  If it is _true_,
> then one must wonder where the ancient "Phooey" came from?  Maybe the
> above explanations are merely the first of the rounds of urban legends.
>