Saturday 21 April 2012

Data in Lisp

Lisp has always been something of a mystery to me. I've read much about it, but still sometimes feel I don't have a good grasp of what is going on. Almost it seems that you need to have written your own Lisp compiler to understand it. Here are a few points which I found enlightening.

Lisp is, simply put, not a functional language! Understanding of state and data is indispensable in understanding the execution of a Lisp program.

We have heard before that a binding associates a symbol with a value. But it is best, I think, to say that "a binding is a storage location". (I'm not sure if such a statement would enrage somebody with a different mental model of Lisp.) For example:

(let ((x 5)) ; create a storage cell and store 5 in it
(setq x 7) ; overwrite 5 with 7
(setq x (cons 1 2)) ;allocate a new storage area with two cells. Store 1 in the first cell and 2 in the second. Store in the aforesaid storage cell ("x") the memory address of the newly allocated area (and keep some note somewhere that it is a pointer to a cons cell, not an integer)
(setq x '(3 4 5))) ;store a pointer to a location which contains a list (3 4 5). (this list should not be modified.)

All in all, the storage which is allocated when this "let" expression is evaluated is as follows:
* A storage location for the symbol "x"
* Storage for cons cell

(The list (3 4 5) was assumed to exist already.)

Accessing the values of bindings


Only one of these can actually be referenced: the cons cell. You cannot access data structures such as those which hold the bindings of symbols. However, it is best to think of them both as storage, because they are both subject to garbage collection.

One thing you might try and do is export the value of a symbol. This is impossible.

To show a couple of work-arounds, here are two ways of implementing Jensen's device.

Example from Wikipedia in Algol 60:

begin
integer i;
real procedure sum (i, lo, hi, term);
value lo, hi;
integer i, lo, hi;
real term;
comment term is passed by-name, and so is i;
begin
real temp;
temp := 0;
for i := lo step 1 until hi do
temp := temp + term;
sum := temp
end;
comment note the correspondence between the mathematical notation and the call to sum;
print (sum (i, 1, 100, 1/i))
end


(I'm not sure what Jensen's device actually is but I think it's the function "sum".)

Let's think for a minute about what we would have to do to get something like this to work. The example works by passing a variable ("i"), and a function which operates on the variable to the function (written here simply as "1/i"). The function is tied specifically to the variable i, and no other.

This makes us think of closures. We'll create a new variable (or in Lisp terms, binding), and a function which uses this variable, and then pass both to a function.

Our first attempt:

(let ((i 0)) ; storage area
(flet ((term () (/ 1 i))) ; function referring to storage area
(sum i 1 100 #'term))) ; arguments are in same order as above example


Wait, no no no, we are evaluating "i" on the last line and passing 0 to the sum function. Maybe we could quote it as "(quote i)" - no, that won't work: then we're passing a symbol to the function, which is no use.

There's no way of doing what we want to do without starting again and using a very different approach.

For example, data structures such as lists can be passed around. So we could make "i" a list of length 1:

(let ((i (list 0)))
(flet
((invi () (/ 1 (first i)))
(sum i 1 100 #'inva)))


Now the sum function can get access to the value in the list.

Here is a second, final alternative using accessor functions:

(let ((i 0))
(flet
((get-i () i)
(set-i (x) (setq i x)))
(invi () (/ 1 i)))
(sum #'get-i #'set-i 1 100 #'invi)))


Now I'll move onto another subject.

Data is richer than atoms and lists


A program or function in Lisp belongs to the smallest collection of lists which satisfies the property that if we say that the lists are of type X, then the members of each list are all one of:
  • Symbols
  • Constants (such as numbers or strings)
  • Lists of type X

or in other words, is a list which you can create if the only data types you know about other than lists are symbols or constants (known as atoms).

Now during the execution of a Lisp program, data is stored of various types. It was a realization for me to notice that the options for the type of stored data are slightly richer than the types that occur in the data structure corresponding to the program code.

For example, in the above examples we pass closures such as "#'invi" as parameters to the function "sum". When "sum" is being executed, some of its parameters have values which are of the closure type, which is nowhere to be seen in code.

No comments:

Post a Comment