Interesting stories about Emacs and Lisp

I saw that Lispy wrote about this. I happened to spot the original speech by Richard Stallman on reddit. The title intrigued me: “My Lisp Experiences and the Development of Emacs”. I’ll go through some pieces of it, because there are some interesting stories in it.

My first experience with Lisp was when I read the Lisp 1.5 manual in high school. That’s when I had my mind blown by the idea that there could be a computer language like that.

This reminded me of a quote from Alan Kay’s (AK) interview with Stuart Feldman (SF) at ACM Queue from a few years ago, which I’ve cited a few times before:

SF If nothing else, Lisp was carefully defined in terms of Lisp.

AK Yes, that was the big revelation to me when I was in graduate school—when I finally understood that the half page of code on the bottom of page 13 of the Lisp 1.5 manual was Lisp in itself. These were “Maxwell’s Equations of Software!” This is the whole world of programming in a few lines that I can put my hand over.

I realized that anytime I want to know what I’m doing, I can just write down the kernel of this thing in a half page and it’s not going to lose any power. In fact, it’s going to gain power by being able to reenter itself much more readily than most systems done the other way can possibly do.

All of these ideas could be part of both software engineering and computer science, but I fear—as far as I can tell—that most undergraduate degrees in computer science these days are basically Java vocational training.

Recently I watched a talk by Ian Piumarta on his COLA (Combined Object-Lambda Architecture) computing system. He is working with Alan Kay at Viewpoints Research Institute on the project of building a complete end-user system within 20KLOC. There’s another article about it here. At the start of his talk he said that in order for the audience to understand what he was presenting, they needed to read the Lisp 1.5 manual. It sounds like the bible of symbolic computing.

It’s a system model based not on the traditional machine-language, 3-address-code (opcode, operand, operand, operand) way of doing things, but on symbol manipulation (function, parameters), where “parameters” is made up of atoms and/or lists. It’s a system based not on early-bound module linkage, but on late binding to objects (Lisp has a notion of objects). This is something I’ve been seeking to understand, so I began reading this book recently. You can find the illustrious Lisp 1.5 manual here (PDF).
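To make the (function, parameters) idea concrete, here is a minimal sketch in Python (my own illustration; the names `seval` and `ENV` are invented, and the real Lisp 1.5 evaluator does much more). Expressions are nested lists of atoms, and symbols are looked up late, at evaluation time:

```python
import operator

# The "environment": symbols bound to functions, looked up at run time.
ENV = {"plus": operator.add, "times": operator.mul}

def seval(expr, env=ENV):
    """Evaluate an s-expression: numeric atoms evaluate to themselves,
    symbolic atoms to their current binding, lists as (function, parameters)."""
    if isinstance(expr, (int, float)):   # numeric atom
        return expr
    if isinstance(expr, str):            # symbolic atom: late-bound lookup
        return env[expr]
    fn, *args = expr                     # a list: apply function to parameters
    return seval(fn, env)(*[seval(a, env) for a in args])

print(seval(["plus", 1, ["times", 2, 3]]))  # → 7

# Late binding in action: rebind a symbol and the same expression changes.
ENV["times"] = operator.add
print(seval(["plus", 1, ["times", 2, 3]]))  # → 6
```

Because programs are just lists, the evaluator itself can be handed one of its own expressions as data, which is the “re-entering itself” property Kay praises above.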

Stallman goes on to talk about how he created Emacs. It was originally written in assembly language for the DEC PDP-10. From the beginning people had the ability to extend Emacs while they were using it. Originally this ability was provided through a command language that was designed for the TECO editor. He said it “was an extremely ugly programming language”. I’ve never seen it, but from what I’ve read it had a lot of features and its command language was extremely cryptic. One typing mistake could hose hours of work. I remember reading a joke saying that programmers would sometimes try typing their name into the command prompt for TECO to see what it would do. Despite this, people added extensions to the language to make it more powerful. Stallman saw it as unsustainable. A solution had to be found, and in the process they discovered a useful lesson about teaching programming:

The obvious lesson was that a language like TECO, which wasn’t designed to be a programming language, was the wrong way to go. The language that you build your extensions on shouldn’t be thought of as a programming language in afterthought; it should be designed as a programming language. In fact, we discovered that the best programming language for that purpose was Lisp.

It was Bernie Greenberg, who discovered that it was. He wrote a version of Emacs in Multics MacLisp, and he wrote his commands in MacLisp in a straightforward fashion. The editor itself was written entirely in Lisp. Multics Emacs proved to be a great success — programming new editing commands was so convenient that even the secretaries in his office started learning how to use it. They used a manual someone had written which showed how to extend Emacs, but didn’t say it was [programming]. So the secretaries, who believed they couldn’t do programming, weren’t scared off. They read the manual, discovered they could do useful things and they learned to program.

So Bernie saw that an application — a program that does something useful for you — which has Lisp inside it and which you could extend by rewriting the Lisp programs, is actually a very good way for people to learn programming. It gives them a chance to write small programs that are useful for them, which in most arenas you can’t possibly do. They can get encouragement for their own practical use — at the stage where it’s the hardest — where they don’t believe they can program, until they get to the point where they are programmers.

Dr. Caitlin Keller, one of Dr. Randy Pausch’s former students, suggested using Alice in conjunction with a course focused on storytelling, rather than programming, to “fool” students into learning programming: teach students programming while they’re doing something else. It’s apparently worked before. Just don’t use that dreaded “P” word at the start.

Another thing to notice is that Greenberg used Lisp for a non-AI project. Lisp is a real general-purpose programming system. It always has been. I think one of the injustices that has occurred in computer science is that CS professors have categorized Lisp exclusively as an artificial intelligence language. When I took CS in college we worked with Lisp for a few weeks, and that was it. Courses in AI were part of an optional curriculum, and students would get further exposure to it there. If students didn’t take them, most spent the rest of their CS major programming in Pascal and C. This was in the late 1980s and early 1990s.

Lisp has been used quite a bit for AI, but by no means is that all it’s good for. As Stallman went on to talk about further into his speech, Lisp is capable of implementing a fully functional operating system. That’s what created the impetus for a few people at the MIT AI Lab to found two companies: Lisp Machines Inc. and Symbolics. Both produced systems that ran Lisp down at the hardware level. Both are now defunct; Symbolics managed to make it to 1995 before it died out.

I thought an interesting story that came into play was James Gosling’s involvement with Emacs. Gosling had developed “GosMacs” (also known as “gmacs”), which was written in C and was the first version of Emacs to run on Unix, but:

I discovered that Gosling’s Emacs did not have a real Lisp. It had a programming language that was known as ‘mocklisp’, which looks syntactically like Lisp, but didn’t have the data structures of Lisp. So programs were not data, and vital elements of Lisp were missing. Its data structures were strings, numbers and a few other specialized things. [my emphasis]

I concluded I couldn’t use it and had to replace it all, the first step of which was to write an actual Lisp interpreter. I gradually adapted every part of the editor based on real Lisp data structures, rather than ad hoc data structures, making the data structures of the internals of the editor exposable and manipulable by the user’s Lisp programs.

Stallman doesn’t say whether Gosling wrote mocklisp, but I infer from his story that he probably did. I think it explains a lot about why Java turned out the way it did.

Stallman said he thought about creating a GNU Lisp operating system, but he decided against it because it required specially programmed processors to run efficiently. Instead he’d focus on creating a Unix-like OS as part of the GNU project. This inspired him to create GNU Emacs, based on GosMacs.

Around 1995, due to actions taken by Sun Microsystems, Stallman decided to create Guile, a version of Scheme, as an extension language for all GNU projects. I like his vision for it, because he describes what I think is sad about the current state of affairs:

Our idea was that if each extensible application supported Scheme, you could write an implementation of TCL or Python or Perl in Scheme that translates that program into Scheme. Then you could load that into any application and customize it in your favorite language and it would work with other customizations as well.

As long as the extensibility languages are weak, the users have to use only the language you provided them. Which means that people who love any given language have to compete for the choice of the developers of applications — saying “Please, application developer, put my language into your application, not his language.” Then the users get no choices at all — whichever application they’re using comes with one language and they’re stuck with [that language]. But when you have a powerful language that can implement others by translating into it, then you give the user a choice of language and we don’t have to have a language war anymore. That’s what we’re hoping ‘Guile’, our scheme interpreter, will do.

Amen to that! I really do wish that we didn’t have to make choices about which language we’re going to use for a project based on what VM architecture someone has already chosen. This is gradually being resolved on the popular VMs, but there are still speed bumps to be overcome, because the VMs weren’t designed for dynamic languages. Microsoft has at least started to address this issue.
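Stallman’s “translate into Scheme” vision can be sketched in miniature. The following Python toy (entirely my own illustration, not how Guile actually works) translates a tiny postfix calculator “language” into host-language expressions, which the host then evaluates, so users of the little language never need to know what it was translated into:

```python
def translate(postfix: str) -> str:
    """Translate a postfix program like '1 2 3 * +' into a host-language
    expression string like '(1 + (2 * 3))'."""
    stack = []
    for token in postfix.split():
        if token in "+-*/":
            b, a = stack.pop(), stack.pop()
            stack.append(f"({a} {token} {b})")   # rebuild as host syntax
        else:
            stack.append(token)                  # operands pass through
    return stack.pop()

src = translate("1 2 3 * +")
print(src)        # (1 + (2 * 3))
print(eval(src))  # 7 — the host evaluates the translated program
```

The design point is that only the translator needs to know the foreign syntax; everything downstream sees ordinary host-language expressions, which is why a powerful host language can carry many surface languages at once.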

It’s interesting to learn that Stallman has long had an affinity for Lisp. He strikes me as a pragmatist, since from my experience a lot of GNU software is written in C. Computers are clearly powerful enough now that every GNU project could be written in Lisp or Guile, but I assume there are other considerations. C is still a popular language, and it would disenfranchise most programmers if all the projects were rewritten in these more powerful languages. Plus, some people are still running slow hardware, and the GNU project wouldn’t want to penalize them.

I’ve said this before, but I think this is where computer science education is getting off on the wrong track. Python is taught in some classes in many CS programs, but powerful, dynamic languages are still relegated to backwater status; C++ and Java are the main languages taught now. I don’t think computing is going to rise to the next level, so to speak, until it’s recognized that we need more powerful programming metaphors and representations, and greater flexibility in our development systems.

The reason this is not generally recognized is that computing is largely a tool-using culture. As Alan Kay said once in one of his “The Computer Revolution Hasn’t Happened Yet” speeches (though he may have been quoting someone else), most people are instrumental reasoners, meaning they only find a tool or idea interesting if it meets a need they have right in front of them. Only a minority of people tend to be interested in ideas for their own sake.

So what might bring about this next step is the issue of programming for multiple cores, and parallel processing. The traditional languages don’t deal with this very well, requiring programmers to manually control threads in code. There’s been talk that dynamic languages might provide the best answer for this. So that might provide the incentive to advance.

—Mark Miller

9 thoughts on “Interesting stories about Emacs and Lisp”

  1. Very concise and well reasoned article. I agree with your conclusion that harnessing the power of multiple processes will lead us to a new computer revolution. Whether that revolution comes from more sophisticated software/hardware that handles distributed processing, or from integrating current platforms through translation, opening up the possibility of greater participation, possibly at lesser skill levels, is yet to be seen. Much of the work that has been done (SOA, BigTable, BitTorrent, …) is leading the way, but until we find a common language, and new and important real problems to solve that cannot be solved using existing technology, progress will continue to be slow. We may be in a transition time, but I suspect that this time will be quite brief.

    Thanks for the article!
    – Ron Teitelbaum

  2. What does Kay mean by, “In fact, it’s going to gain power by being able to reenter itself much more readily than most systems done the other way can possibly do.”

  3. Mark –

    You saved the best for last, with: “So what might bring about this next step is the issue of programming for multiple cores, and parallel processing. The traditional languages don’t deal with this very well, requiring programmers to manually control threads in code. There’s been talk that dynamic languages might provide the best answer for this. So that might provide the incentive to advance.”

    I am convinced, as I wrap up a project for TechRepublic (photomosaics), that anything beyond “crummy” multithreading is nearly impossible in the current mainstream programming paradigms. I keep reading (and occasionally writing) that our compilers need to get smarter. But we all know that it is impossible for the compilers to get much smarter. Why? Because OO and procedural languages express precisely zero *intention*, and only minutiae. Without understanding *intention*, how can the compiler do more than some simple tricks with loops and the like?

    I am not 100% sure if Lisp itself is the answer to this, but Lisp (and other functional languages) are at least intended to focus more on intention, not instruction.

    I think dynamic languages, in and of themselves, are a red herring. The difference between a dynamic language and a static one is merely the level of assumptions that the system makes. Perl appears to understand “intention” simply because the people who wrote it understood how programmers think, and had the system make assumptions for quite a good many things. Try writing a Perl program that takes advantage of Perl’s “Do what I meant, not what I wrote” philosophy with a twist: replace the newline marker, everywhere it is used for things (signalling the end of input at a prompt, record separators, paragraph delineation, etc.), with some other character (or combination of characters), and Perl falls apart; it can be partially saved by setting the appropriate globals at the beginning. For all of Perl’s design, it simply holds the illusion of knowing your *intention*, based on the pervasiveness of certain conventions in programming, particularly on the *nix platform.


  4. Lispy –

    It means that Lisp can be written in Lisp. I glanced over page 13 of that manual through the provided link; while I’m not a Lisp expert, it looks like the entire language has been expressed in Lisp itself; the rest of the details, like providing functionality for operators, is a fairly trivial task after that. In high school I went through this with Scheme. The stripped-down version we used (EdScheme) had 13 or 17 functions and operators built into it… the elevator or thermostat principle taken to the extreme. In that state it was hardly usable (addition, but no multiplication, for example), but it was enough to bootstrap the rest of the language from!

    In a more metaphorical sense, what AK is talking about is that any given Lisp program can redefine the Lisp language itself. For example (forgive my bad Lisp here):

    (define (* x y)
      (+ x y))

    would allow you to redefine the “*” operator to mean “add” instead of multiply. In fact, you could redefine the “define” operator (which declares functions) to mean “print ‘hello world’ to the console and exit the program”, rendering the rest of the program from that point forward unable to define new functions.

    That is *true power*. This is why Lisp can be mutated into any language out there, as in the discussion above about using Lisp (or its dialect Scheme, in the guise of Guile) as the extensibility language for everything; once something is in Lisp, that very fact means that something that looks like C or Perl or APL or Prolog or Java can be run, with enough work. In this sense, Lisp functions as a metalanguage, a language for defining other languages.

    And that is what makes it (and others similar to it, or variants of it) so darn powerful. It is a fundamental difference in the paradigms.


  5. @Justin:

    Alan Kay mentioned Perl in the ACM interview I cited:

    “I think a lot of the success of various programming languages is expeditious gap-filling. Perl is another example of filling a tiny, short-term need, and then being a real problem in the longer term. Basically, a lot of the problems that computing has had in the last 25 years comes from systems where the designers were trying to fix some short-term thing and didn’t think about whether the idea would scale if it were adopted.”

    I agree with his general statement on the success of programming languages. I’ve participated in this mentality myself, as many have, though perhaps not most. Most developers are probably more conservative, though I don’t think they have any greater eye for good overall computing design.

    I’ve gotten excited when I see someone come out with a language that has some new features. The problem is we tend to not see the bigger picture. For example, it was frustrating to me to see that Microsoft had added anonymous functions to .Net, but had done nothing to leverage this feature in the FCL. They added a “yield” keyword, but you had to use it inside of a class method that implemented IEnumerator, or something. The proliferation of language features, in and of itself, doesn’t serve us very well, as you’ve seen with Perl. The features in the languages were just thrown into CLR 2.0 without any sense of where they fit into a larger design. As developers we should pay more attention to the overall design than “ooh” and “ahh” at language features. Language features are only really powerful when they are compatible with a design rationale. I refer you back to your experience with threads.

    You make a good point that the term “dynamic language” doesn’t do justice to the subject, because one could create a weak dynamic language that doesn’t help with threading. Some have called VB a “dynamic language”, for example. What I was shooting for with the term was languages like Lisp, Scheme (two of the languages I discussed), Haskell, and perhaps Python. Ruby and Smalltalk provide decent solutions, though I don’t know if they’re of the same caliber in this area.


    I think what Justin is talking about is part of the answer, though I would call it “Lisp being written in Lisp”, rather than being able to “re-enter itself”, in the same way that you can talk about “writing C in C” (C compilers are written in C all the time). Being able to redefine the language is a part of Lisp’s power, but I don’t think that hits the target either for Kay’s statement.

    At a basic level what Kay is probably talking about is the ability to use an eval function within the code. This effectively causes Lisp to “re-enter itself” using its parser with a dynamically generated expression. Before this part in the interview he was discussing Java. So that’s probably “the other way” he was talking about. Java doesn’t have this ability to “re-enter itself” in the same expressive way that Lisp does. You can do dynamic stuff in it, like create a class, but it’s a real chore. Smalltalk, for example, makes the dynamic creation of classes MUCH easier.
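    A rough Python analogue of both points (my own sketch; Python’s eval is a much weaker cousin, since it re-enters the evaluator through a string rather than through structured lists the way Lisp does):

```python
# Code built as ordinary data at run time, then fed back to the evaluator.
expr = " ".join(["2", "+", "3", "*", "4"])
print(eval(expr))   # → 14

# And the dynamic-class point: in a dynamic language, creating a class
# at run time is a one-liner, not the chore it is in Java.
Point = type("Point", (), {"describe": lambda self: "a dynamically created class"})
print(Point().describe())
```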

  6. Pingback: Programming and Development mobile edition

  7. RE: “If C# is so much more efficient than VB.NET, why do all of the code samples appear to be the same length and look virtually identical?”

    Because they were written at the same time by the same technical writer to demonstrate the same concept. Most Microsoft programmer/writers create a code example in one language and then either use a translator or VS to replicate the code in the other. I do this all the time. I typically create the sample in C# as I am most familiar with Java. Then I create a VB project and recreate the bugger line-by-line.

    I’m not sure where you got the idea that C#/.NET is more efficient than VB.NET. I certainly have never claimed that. Could you point out the Microsoft Web page where this is asserted?

  8. Doug –

    “I’m not sure where you got the idea that C#/.NET is more efficient than VB.NET. I certainly have never claimed that. Could you point out the Microsoft Web page where this is asserted?”

    This idea came not from official Microsoft literature, but from “common sense” and such. Heck, I’m guilty of it too. The thinking is, C# is in the C syntax style, “which we all know” is much more “efficient” than the VB.Net syntax.

    Your explanation makes perfect sense, and it highlights the danger of learning a language through the standard reference docs. What you end up with is C# written like VB.Net, but with curly braces replacing “then” and “next” expressions. So someone learning C# through examples is really learning VB.Net (since syntactically, it is the lowest common denominator of the two) with a different syntax, instead of the “C# Way of Doing Things”.

    Also, in case it is helpful: you responded to the “TalkBack” of my post at TR; I did not actively post that comment here, if that helps give it context. 🙂


  9. Pingback: SICP: What is meant by “data”? « Tekkie
