On the development of my own Lisp interpreter, part 2: lambdas, fexprs and closures.

My thinking has been to make the keyword ‘lambda’ (and ‘nlambda’) represent entry-points to a byte-code compiler: the idea is that if we see an expression like:

(lambda (x y) (+ x y))

This function invokes the compiler and returns a procedure symbol which represents the bytecode that adds x and y together and returns a value. So internally, somewhere in my system, this would return a data structure like:

Procedure Definition

Also recall that my reason for using an ‘nlambda’ construct was to provide me a way to define fexprs, so I could easily write certain primitives without having to build those primitives into the interpreter. The idea was that if I could create an ‘nlambda’ declaration, then certain primitives could easily be written as fexprs, rather than having to complicate my interpreter or compiler. Thus:

(define quote (nlambda (x) x))

(define begin (nlambda ins) 
   (cond ((null? ins) '())
         (#t (eval (car ins))
             (begin (cdr ins)))))

The first is the definition of the ‘quote’ instruction, the second defines ‘begin’, a method for evaluating a sequence of instructions in a list of instructions.

Also recall that I decided that I wanted a lexical Lisp rather than a dynamically bound Lisp, for reasons of given in the previous post.

And thus my problem with closures arises.

Let me spell it out.

Suppose we assume that the binding “closure” of our function is the stuff within the function itself: that is, when we compile our procedure above, the scope of the accessible variables is either (a) the variables declared within the procedure (defined by the lambda function), or (b) the global variables in our system.

This makes sense, since it means if we write:

(define add (lambda (x y) (+ x y gz)))  ;; gz is global

Then we execute add:

> (define gz 3)
> (add 1 2)

Which makes sense since 1 + 2 + 3 = 6.

It also means if we use our ‘add’ routine in another function:

(define foo (lambda (gz) (add gz 2)))

It’s pretty clear our intent is that ‘gz’ within ‘foo’ is a local variable, and if we call ‘foo’ with 1:

> (foo 1)

This behavior makes intuitive sense because while the variable ‘gz’ in ‘foo’ hides the global variable ‘gz’, it shouldn’t change the intent of ‘add’ simply because I called it from within another function which hides a global variable. Meaning the intent of ‘add’ was always to use the global variable ‘gz’, and we should expect 6 from ‘(foo 1)’, not 4 (1 + 2 + 1), which we would have gotten in, say, a dynamically bound Lisp environment.


If we also assume that the lexical binding rules apply to ‘nlambda’, then what happens if we define a function ‘setq’:

(define setq (nlambda (a b) (set a (eval b))))

And suppose we call ‘setq’ from another function:

(define bar (nlambda (q) (setq gz q)))

Now think for a moment: nlambda is a function call with local variable scope. This means that when we get to the statement ‘(set a (eval b))’, well, ‘a’ evaluates to the variable ‘gz’. And ‘b’ evaluates to the variable ‘q’.

And so we call (eval q).

But remember: nlambda is a function in a lexical language: calling it pushed the variable context. So when we call ‘eval’, here is what our stack looks like:

Stack Representation

And because we’re a lexical version of Lisp, our evaluator only looks at the head of the stack, and at the global variable list. Meaning our variable ‘q’ is not in scope.

So instead of having the intended effect, our setq will fail: it is unable to evaluate ‘q’.

It’s worse than that, however. Suppose we define our closure for all lambda functions based on the environment in which they are encountered. In other words, if we define (for example) the list of available variables by running our interpreter so that all new variables defined by ‘lambda’, ‘let’ and other constructs push onto a local list of available functions, so (for example) writing:

(let ((x 1)) (lambda (y) (+ x y)))

This operation tracks all variable being defined by lambda on a single local variable environment stack, and the lexical closure of lambda includes both the variables x and y, this still does not solve my problem, and that is the lexical closure of my ‘setq’ function does not actually involve q, so (eval b) will fail every time.

This problem does not occur in the original dynamically bound Lisp language invented in the 60’s simply because there was no closure. The entire stack of variables was searched in order to find the referring variable. Of course this makes variable resolution a non-trivial problem, because every time you access a variable you have to scan the entire state space for variables. You can’t just compile a variable name as a fixed offset within an array of variables and make variable access linear time.

At this point I think I need to make a few decisions.

(1) It’s pretty clear I need to build a full interpreter that is able to execute the language as well as a compiler within the lambda function.

(2) I’m unclear if I want to deal with closures right now. I suspect the correct answer is to bite the bullet and deal with closures.

(3) I’m going to have to bite the bullet and include a whole bunch of fundamentals (such as ‘let’, ‘setq’ or ‘set!’, ‘do’, ‘loop’, ‘prog’ or whatever other looping constructs that define local variable state) within my compiler, rather than defining them as fexprs. Fexprs don’t seem to work well for the purpose I want them for.

(4) Rethink my fexpr verses macro decision. Fortunately I can drop nlambda from my prototype, build in my primitives, and worry about macros later.

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s