Posts Tagged ‘SICP’

It’s been 5 years since I’ve written something on SICP–almost to the day! I recently provided some assistance to a reader of my blog on this exercise, and in the process I of course had to understand something about it myself. As with Exercise 1.19, I didn’t think this one was written that clearly. It seemed like the authors assumed that students were already familiar with the Miller-Rabin test, or at least with modular arithmetic.

The exercise text made it sound like one could implement Miller-Rabin as a modification to the Fermat test (based on Fermat’s Little Theorem, or FLT). From helping the reader, I found that one could indeed do it that way, though the implementation looked a bit convoluted. When I tried solving it myself, I felt more comfortable ignoring the prior FLT solutions in SICP, and completely rewriting the expmod routine. Along with that, I wrote some intermediate functions whose results were ultimately passed to expmod. My own solution was iterative.

I found these two Wikipedia articles, one on modular arithmetic, the other on Miller-Rabin, more helpful than the SICP exercise text in understanding how to implement it. Every article I found on Miller-Rabin through a Google search described the algorithm for carrying out the test. It was still a bit of a challenge to translate the algorithm to Scheme code, since the articles assumed an imperative style, and there is a special case in the logic.

It seemed like, when it came right down to it, the main challenge was understanding how to factor n – 1 (n being the candidate tested for primality), and how to handle the special case, while staying within the confines of what the exercise required: do the exponentiation, use the modulus function (remainder), and test for non-trivial roots inside of expmod. Once that was worked out, testing for non-trivial roots turned out to be pretty simple.
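
If you want to see the shape of the “modify expmod” approach before writing your own, here is a minimal sketch of it. This is the route the exercise text suggests, not my iterative solution; the helper name check-nontrivial-sqrt is my own, and I’m assuming an interpreter that provides random, as PLTScheme does:

(define (expmod base exp m)
   (cond ((= exp 0) 1)
         ((even? exp)
          (check-nontrivial-sqrt (expmod base (/ exp 2) m) m))
         (else
          (remainder (* base (expmod base (- exp 1) m)) m))))

;; Returns 0 to signal that x is a non-trivial square root of 1 mod m
;; (which forces the final expmod result to 0, failing the test);
;; otherwise returns x squared mod m, as usual.
(define (check-nontrivial-sqrt x m)
   (let ((squared (remainder (* x x) m)))
      (if (and (= squared 1)
               (not (= x 1))
               (not (= x (- m 1))))
          0
          squared)))

(define (miller-rabin-test n)
   (define (try-it a)
      (= (expmod a (- n 1) n) 1))
   (try-it (+ 1 (random (- n 1)))))

A prime n passes this test for every a; a composite n (even a Carmichael number) fails it for at least half of the possible values of a.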

If you’re looking for a list of primes to test your solution against, here are the first 1,000 primes.

Read Full Post »

This exercise in Section 3.3.3 demonstrates a concept called “memoization,” where values computed during the run of a function are stored, so that when a computation is requested again, the previously computed value can simply be retrieved rather than recomputed. It’s a caching technique.

I got really frustrated trying to do this exercise, because of the way the make-table function was portrayed. The exercise text doesn’t say anything about it. The code that was given for the exercise just uses it. In the past, exercises would use code that had already been introduced and discussed in the section/chapter. There is a set of functions that implement a single-dimension table (all that would be needed for this exercise) that is discussed earlier in the section. When I brought what I thought were the appropriate functions together, what I saw was this:

(define (make-table)
   (list '*table*))

(define memo-fib
   (memoize (lambda (n)
               (cond ((= n 0) 0)
                     ((= n 1) 1)
                     (else (+ (memo-fib (- n 1))
                              (memo-fib (- n 2))))))))

(define (memoize f)
   (let ((table (make-table)))
      (lambda (x)
         (let ((previously-computed-result (lookup x table)))
            (or previously-computed-result
                (let ((result (f x)))
                   (insert! x result table)
                   result))))))

Even the two-dimensional versions of make-table that were presented initialized their local table the same way. I’m used to paying attention to details in code, and so I tried to make sense of this. Every time memo-fib got called, the table got reinitialized, wiping out everything that had been added to it! This made no sense for this exercise, since the whole idea was to store values in the table, and then retrieve them later.

By looking at what other people had done with this exercise, I realized that I needed to use “wishful thinking,” and ignore any prior implementations of make-table. Just assume that make-table initializes its table structure on the first call, and returns a reference to its table (or a reference to a dispatch function) on subsequent calls to it. Assume that it does not reinitialize its internal structure, leaving anything that was previously added intact. Once I made these assumptions, the exercise made more sense.
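
For what it’s worth, here is a minimal sketch of a make-table/lookup/insert! trio that behaves this way, using PLTScheme’s (Racket’s) built-in mutable hash tables instead of any of the book’s list-based tables:

(define (make-table) (make-hash))

(define (lookup key table) (hash-ref table key #f))

(define (insert! key value table) (hash-set! table key value))

The table these create persists, with anything previously added left intact, and lookups and inserts run in roughly constant time, which matters for the point below.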

What I saw other people do as well was assume that the insert! and lookup functions operated in constant time, taking them out as a factor in the number of steps. I do not know what assumptions are “supposed” to be made about the insert! and lookup functions. If you are doing this exercise for a grade, take these assumptions at your own risk. To me, it made sense, but perhaps your professor has a different idea in mind for how these functions are supposed to be considered in your analysis.

Read Full Post »

I was going through exercises in Section 3.3 of SICP recently (Modeling with Mutable Data), and discovered that my version of PLTScheme (4.1.4) does not include the set-car! and set-cdr! operators. It turns out the team that maintains this development environment (now called “Racket”) changed this in Version 4. Originally Scheme had mutable pairs by default: cons created a mutable pair, car and cdr simply read its contents, and you used set-car! and set-cdr! to change them. The dev team changed PLTScheme so that cons creates immutable pairs (with car and cdr operating on them as usual). To get mutable pairs you use mcons, mcar, and mcdr for the operations you’d carry out on immutable pairs with cons, car, and cdr, and where the SICP text says to use set-car! and set-cdr! to manipulate mutable pairs, you need to use set-mcar! and set-mcdr!.
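
Here’s a quick sketch of the correspondence, assuming PLTScheme 4.x (all of these names are built in):

(define p (mcons 1 2))   ; the mutable analogue of (cons 1 2)
(mcar p)                 ; => 1, like car
(mcdr p)                 ; => 2, like cdr
(set-mcar! p 10)         ; where SICP says (set-car! p 10)
(set-mcdr! p 20)         ; where SICP says (set-cdr! p 20)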

This change only applies to the issue of immutable vs. mutable pairs. The dev. team made this decision, because in their view it made Scheme more of a pure functional language. However, I noticed that the set! operator (which changes a binding) still exists and works as expected in my copy of PLTScheme.

Edit: I goofed a bit when I posted this earlier today. I said that PLTScheme users should use mcar and mcdr to carry out the same operations as set-car! and set-cdr! in the SICP text. That is not the case. People should use set-mcar! and set-mcdr! for those operations.

Read Full Post »

I’m backtracking with this post. My last SICP post talked about sections 2.2 and 2.3. I’m now going back to Section 1.2 to cover an exercise I had not done until recently. I may get into more of them. Time will tell.

One of the things Section 1.2 talks about is how to calculate Fibonacci numbers recursively and iteratively. The recursive procedure is O(Φ^n) (Φ being the golden ratio), and the iterative method is O(n). Another concept this section brings up over and over again is exponentiation, and this exercise prompts you to look at it in a new way: it just means repeating something n times, and the operation doesn’t necessarily have to do with multiplication.

The exercise says there’s an algorithm for calculating Fibonacci numbers in O(log n). I had a few false starts with this. The only way I was able to figure it out was to go through the text of the exercise slowly and methodically, and do what each part prompted me to do, before going on to the next sentence. I think once again my ignorance of mathematics came out, because it looks like this exercise was written by a mathematician. The author had a way of expressing things that was foreign to me, and so I had to spend some time teasing out what they meant. I’m going to break up the exercise text into “bite size” pieces to clarify what’s being communicated.

The first part I worked on was:

Recall the transformation of the state variables a and b in the fib-iter process of section 1.2.2: a ← a + b and b ← a. Call this transformation T, and observe that applying T over and over again n times, starting with 1 and 0, produces the pair Fib(n + 1) and Fib(n).

I’m not going to give you the play-by-play of this step, but I will clarify what the author is saying. Where it says, “Call this transformation T“, it means keep this association in mind:

T: a ← a + b and b ← a (In other words, when you think of T, think of the transformation a ← a + b and b ← a)

By “transformation” they mean, for example, “The new value of a (to the left of the arrow) is derived from the formula on the right side of the arrow.” The same goes for b. Where it says, “observe that applying T over and over again n times, starting with 1 and 0, produces the pair Fib(n + 1) and Fib(n),” it means calculate the values for a and b by using the transformation T and apply the results to itself over and over again, starting with the values a = 1, and b = 0. Observe that the pattern of pairs (the values for a and b) is the same as if you had calculated the Fibonacci pairs Fib(n + 1) and Fib(n) for each transformation step.
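
For instance, starting from (a, b) = (1, 0) and applying T over and over:

(1, 0) → (1, 1) → (2, 1) → (3, 2) → (5, 3) → (8, 5) → …

which are exactly the pairs (Fib(2), Fib(1)), (Fib(3), Fib(2)), (Fib(4), Fib(3)), and so on.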

Think of this as “Step 1” for the exercise. It’s important that you do it, because this sequence of calculations becomes relevant later.

Next, it gets into an expanded concept of exponentiation:

In other words, the Fibonacci numbers are produced by applying T^n, the nth power of the transformation T, starting with the pair (1,0).

All the author is trying to do here is recast what you just did (applying the transformation T to itself over and over again) as the concept of taking something to the nth power. It does not change the meaning of what you did at all. All it represents is looking at what you did in a different way; in a way you probably had never imagined was possible. The reason the author wrote this was to help you relate to a concept, which will be introduced later in the exercise text.

Just a little hint here. Since the exercise brought up the issue of exponents in relation to iterations of the transformation T, I noticed that it was helpful later on if I put the corresponding exponent (1, 2, 3, etc.) next to each Fibonacci pair in the part I labeled above, “Step 1”.

Next, it gets really confusing! I don’t know about you, but I think they could’ve written this more clearly. Ready? Here we go!

Now consider T to be the special case of p = 0 and q = 1 in a family of transformations T_pq, where T_pq transforms the pair (a,b) according to a ← bq + aq + ap and b ← bp + aq.

I had to look at this sentence a bunch of times before I got it. What I got hung up on was the use of the term “T” and the reuse of the variables a and b in the transformation, combined with the fact that the only variables that appeared to be associated with T (on first blush) were p and q. I had to keep in mind, though, that a and b were not used as annotations for the transformation T above, either. I think the reason for the “pq” annotation will become clear as you go through this part.

What the author is really saying is, “Let’s bring in a different kind of transformation we’ll call T_pq. It’s defined as: T_pq: a ← bq + aq + ap and b ← bp + aq.” The transformation T has not changed. It still exists (separate from T_pq), and it’s still: T: a ← a + b and b ← a. What the author is saying is, “If you use (apply) transformation T_pq with the parameters p = 0 and q = 1, the algebraic result is the same as the transformation T.” Go ahead. Give it a try, and convince yourself this statement is true.
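
For anyone who wants to check their work afterward: substituting p = 0 and q = 1 into T_pq gives a ← b·1 + a·1 + a·0 = a + b and b ← b·0 + a·1 = a, which is exactly T.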

This is a setup for the next sentence, which is equally as confusing as the last one:

Show that if we apply such a transformation T_pq twice, the effect is the same as using a single transformation T_p’q’ of the same form, and compute p’ and q’ in terms of p and q.

Okaaaaay. Got it? The way I interpreted this was, “Apply T_pq twice, with the parameters p = 0 and q = 1. Next, let’s introduce a second instance of the formula for T_pq we’ll call “T_p’q’“, with variables p’ and q’ (a ← bq’ + aq’ + ap’ and b ← bp’ + aq’ — this is where “of the same form” comes from). See if you can find a way to make T_p’q’ equal the second-order transformation of T_pq (the one where you applied the transformation twice with the parameters p = 0 and q = 1). Once you’ve done this, come up with a way to derive the values for p’ and q’ from p and q.” Anyway, this interpretation worked for me. I hope I’ve expressed this clearly. I’m leaving a bit of mystery here to maintain fidelity to the way the original exercise was written.

I’m going to leave this part as a challenge for you to figure out, though the last part of the exercise text (below) gives you a clue about what’s going on here. I’ll just say that everything you have done from the beginning of this exercise up to this point has provided you with all of the information you need to come up with the solution for this.

The exercise finishes with:

This gives us an explicit way to square these transformations, and thus we can compute T^n using successive squaring, as in the fast-expt procedure. Put this all together to complete the following procedure, which runs in a logarithmic number of steps.

A Scheme procedure follows in the text, where most of the code is filled in. All you have to do is fill in the steps where p’ and q’ need to be computed. (Note: I talked about fast-expt earlier here).
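
For reference, here is the procedure as the book gives it, with <??> marking the two spots you’re asked to fill in:

(define (fib n)
   (fib-iter 1 0 0 1 n))

(define (fib-iter a b p q count)
   (cond ((= count 0) b)
         ((even? count)
          (fib-iter a
                    b
                    <??>      ; compute p'
                    <??>      ; compute q'
                    (/ count 2)))
         (else (fib-iter (+ (* b q) (* a q) (* a p))
                         (+ (* b p) (* a q))
                         p
                         q
                         (- count 1)))))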

Okay, now I think this ending is somewhat misleading. I did not find, after doing the above steps, that I immediately knew how to do successive squaring in this exercise. It’s going to take some thinking on your part, considering what you’ve done up to this point, to figure out how to do it. What the Scheme code does make obvious is that the successive squaring takes place solely in the code that computes p’ and q’. I will say that the behavior of successive squaring here is like that in the exercise with fast-expt (Exercise 1.16), in the sense that you’re “jumping” to successive values to get to Fib(n), rather than computing each value as you incrementally approach n.

One hint I’ll give you is once you think you’ve figured out how to compute p’ and q’ in the code, make sure you try out higher values of n with the “fib” procedure in this exercise, like Fib(10), Fib(11), Fib(12), etc., and compare what you get with the known Fibonacci sequence (you can compare it with output of some other version of the “fib” procedure that’s discussed earlier in this section, or you can look up the sequence on the internet). You’ll probably realize that the formulas you came up with are wrong when you do this (I did), and you need to re-evaluate them. The exercise guides you a part of the way to the answer, but makes it sound like you’ve made it most of the way, when you really haven’t.

Maybe I just need to hang out with mathematicians more to see how they use language, but I thought this exercise was poorly written. It seemed to try to hide things by being obscure, to make the problem more challenging, rather than being straightforward and setting forth the challenge in the last step, which is really where it is.

This exercise is worth doing for the beautiful solution that results. It goes to show that no matter how efficient you think your algorithm is, it’s worth exploring to see if you can make it even more efficient, and that mathematics is key to figuring out how to do that.

Read Full Post »

As I’ve gone through SICP I’ve noticed a couple exercises that are very similar to programming problems we were given when I took Comparative Programming Languages in my sophomore year of college, when we were covering Lisp: 8 Queens (in Section 2.2.3 of SICP), and symbolic differentiation (in Section 2.3.2). I wonder if our professor got these problems out of this book, though they could’ve been classic computer science or AI problems. I’m going to focus on the symbolic differentiation exercises in this post.

The problem we were given in Comparative Programming Languages was to create a series of functions that would carry out all of the symbolic differentiation rules (for addition/subtraction, multiplication, division, and exponentiation) for expressions that had two, maybe a few more operands per operator (except for exponents), and simplify the resulting algebraic expressions. Our professor added, “I’ll give extra credit to anyone who implements the Chain Rule.” He gave us some helper functions to use, but we were expected to create the bulk of the code ourselves. In addition we could use no destructive operations (no set or setq). He wanted everything done in a pure functional style right off the bat. We had about a week or so to create a solution. We had only started covering Lisp the week before, and it was the first time in my life I had ever seen it. He never explained to us how a functional language worked. He was one of the hardest CS professors I had. What made it worse was he was terrible at teaching what he was trying to convey. He often came up with these really hard but interesting problems, though.

Creating the rules for addition/subtraction and exponentiation was pretty easy once I had beaten my head against the wall for a while trying to understand how Lisp worked (I don’t think I ended up implementing the rules for multiplication and division). The hard part, which I didn’t even try, was simplifying the resulting expressions. I didn’t have the skill for it.

Having gone through that experience I was surprised by the treatment SICP gave to this exercise. On the first iteration it gave away almost everything I had tried to figure out myself for this assignment. I was wondering, “Gosh. This seemed like such a challenging book in the first chapter. It’s practically giving away the answers here!”

I’ll give the code for “deriv” as I think it’ll provide some clarity to what I’m talking about:

(define (deriv exp var)
   (cond ((number? exp) 0)
         ((variable? exp)
          (if (same-variable? exp var) 1 0))
         ((sum? exp)
          (make-sum (deriv (addend exp) var)
                    (deriv (augend exp) var)))
         ((product? exp)
          (make-sum
             (make-product (multiplier exp)
                           (deriv (multiplicand exp) var))
             (make-product (deriv (multiplier exp) var)
                           (multiplicand exp))))
         (else (error "unknown type expression -- DERIV" exp))))

The other functions that are needed to make this work are in the book. I modified the “else” clause, because PLTScheme doesn’t implement an “error” function. I use “display” instead.

SICP doesn’t ask the student to implement the differentiation rule for division, or the Chain Rule, just addition/subtraction, multiplication, and one of the exponentiation rules. The point wasn’t to make the exercise something that was hard to implement. It wasn’t about challenging the student on “Do you know how to do symbolic differentiation in code, recognize patterns in the algebra, and figure out how to simplify all expressions no matter what they turn out to be?” It wasn’t about making the student focus on mapping the rules of algebra into the computing domain (though there is some of that). It was about realizing the benefits of abstraction, and what a realization it is!

SICP asked me to do something which I had not been asked to do in all the years that I took CS, and had only occasionally been asked to do in my work. It asked me to take an existing structure and make it do something for which it did not appear to be intended. It felt like hacking, almost a little kludge-ish. I can see why I wasn’t asked to do this in college. What CS generally tried to do at the time was create good and proper programmers, instilling certain best practices. Part of that was writing clear, concise code, which was generally self-explanatory. A significant part of it as well was teaching structured programming, and demanding that students use it, in order to write clear code. That was the thinking.

What amazed me is SICP had me using the same “deriv” function–making no changes to it at all, except to add abstractions for new operations (such as exponentiation and subtraction)–to operate on different kinds of expressions, such as:

(+ (+ (expt x 2) (* 2 x)) 3)

(+ (+ x 3) (* 2 x) 3)

(((x + 3) + (2 * x)) + 3)

(x + 3 + 2 * x + 3) — (with operator precedence)

It could deal with all of them. All that was required was for me to change the implementation beneath the abstractions “deriv” used for each type of expression.
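
To make that concrete, here is a sketch of what the swapped-in layer can look like for the fully parenthesized infix form above (this is essentially Exercise 2.58a; =number? is the simplification helper from the book’s section, and deriv itself stays untouched):

(define (=number? exp num) (and (number? exp) (= exp num)))

(define (sum? x) (and (pair? x) (eq? (cadr x) '+)))
(define (addend s) (car s))
(define (augend s) (caddr s))
(define (make-sum a1 a2)
   (cond ((=number? a1 0) a2)
         ((=number? a2 0) a1)
         ((and (number? a1) (number? a2)) (+ a1 a2))
         (else (list a1 '+ a2))))

(define (product? x) (and (pair? x) (eq? (cadr x) '*)))
(define (multiplier p) (car p))
(define (multiplicand p) (caddr p))
(define (make-product m1 m2)
   (cond ((or (=number? m1 0) (=number? m2 0)) 0)
         ((=number? m1 1) m2)
         ((=number? m2 1) m1)
         ((and (number? m1) (number? m2)) (* m1 m2))
         (else (list m1 '* m2))))

With these in place, (deriv '((x + 3) + (2 * x)) 'x) evaluates to 3, just as the prefix version would.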

It brought to mind what Alan Kay said about objects–that an object is a computer. The reason I made this connection is the exercises forced me to break the semantic mapping I thought “deriv” was implementing, so that I could see “deriv” as just a function that computed a result–a specialized computer. Kay said that the abstraction is in the messages that are passed between objects in OOP, not in the objects. Here, I could see that the abstraction was in the function calls that were being made inside of “deriv”, not the code of “deriv”. The code is a pattern of use which is structured to follow certain rules. I could look at the function algebraically. In reality, the function names don’t mean anything that’s hard and fast. I thought to myself, “If I changed the implementation beneath the abstractions yet again–still not changing ‘deriv’–I might be able to do useful computing work on a different set of symbols, ones that aren’t even related to classical algebra.”

I recalled a conversation I remember reading online several years ago where a bunch of programmers were arguing over what were the best practices for creating abstractions in OOP. The popular opinion was that the objects should be named after real world objects, and that their behavior should closely model what happened in the real world. One programmer, though, had a different opinion. Whoever it was said that the goal shouldn’t be to structure objects and their relationships according to the real world equivalents, virtualizing the real world. Rather the goal should be to find computational processes that happen to model the real world equivalents well. So in other words, the goal, according to the commenter, should be to find a computer (I use the term “find” broadly, meaning to create one) that does what you need to do, and does it well, not try to take any old computing model and force it into a fashion that you think would model the real world equivalents well. The computer world is not some idealized version of the real world. It is its own world, but there are places where computing and real world processes intersect. The challenge is in finding them.

—Mark Miller, https://tekkie.wordpress.com

Read Full Post »

I want to discuss what’s introduced in Section 2.1.3 of SICP, because I think it’s a really important concept.

When we think of data in typical CS and IT settings, we think of strings and numbers, tables, column names (or a reference from an ordinal value), XML, etc. We think of things like “Item #63A955F006-006”, “Customer name: Barbara Owens”, “Invoice total: $2,336.00”.

How do we access that data? We typically use SQL, either directly or through stored procedures, using relations to serialize/deserialize data to/from a persistence engine that uses indexing (an RDBMS). This returns an organized series of strings and numbers, which are then typically organized into objects for a time by the programmer, or a framework. And then data is input into them, and/or extracted from them to display. We send data over a network, or put it into the persistence engine. Data goes back and forth between containers, each of which use different systems of access, but the data is always considered to be “dead” numbers and strings.

Section 2.1.2 introduces the idea of data abstraction, and I’m sure CS students have long been taught this concept, that data abstraction leads to more maintainable programs because you “program to the interface, not to the object”. This way you can change the underlying implementation without affecting too much how the rest of the program runs. SICP makes the same point.

In 2.1.2 it uses a simple abstraction, a rational number (a fraction), as an example. It defined 3 functions: make-rat, numer, and denom. Make-rat would create a rational number abstraction out of two numbers: a numerator and denominator. Numer and denom return the numerator and denominator from the abstraction, respectively. It calls make-rat a “constructor”, and numer and denom “selectors”.
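
The book’s first implementation of these is about as small as it gets, with a pair as the underlying representation:

(define (make-rat n d) (cons n d))

(define (numer x) (car x))

(define (denom x) (cdr x))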

In 2.1.3 it questions our notion of “data”, though it starts from a higher level than most people would. It asserts that data can be thought of as this constructor and set of selectors, so long as a governing principle is applied which says a relationship must exist between the selectors such that the abstraction operates the same way as the actual concept would. So if you use (in pseudo-code):

x = (make-rat n d)

then (numer x) / (denom x) must always produce the same result as n / d. The book doesn’t mention this at this point, but in order to be really useful, the abstraction would need to work the same way that a rational number would when arithmetic operators are applied to it.

So rather than creating a decimal out of a fraction, as would be the case in most languages:

x = 1 / 4 (which equals 0.25)

You can instead say:

x = (make-rat 1 4)

and this representation does not have to be a decimal, just the ratio 1 over 4, but it can be used in the same way arithmetically as the decimal value. In other words, it is data!

In Section 2.1.2, the abstraction used for a rational number was a “pair”, a cons structure (created by executing (cons a b)). In 2.1.3 it shows that the same thing that was done to create a rational number data abstraction can be applied to the concept of a “pair”:

This point of view can serve to define not only “high-level” data objects, such as rational numbers, but lower-level objects as well. Consider the notion of a pair, which we used in order to define our rational numbers. We never actually said what a pair was, only that the language supplied procedures cons, car, and cdr for operating on pairs. But the only thing we need to know about these three operations is that if we glue two objects together using cons we can retrieve the objects using car and cdr. That is, the operations satisfy the condition that, for any objects x and y, if z is (cons x y) then (car z) is x and (cdr z) is y. Indeed, we mentioned that these three procedures are included as primitives in our language. However, any triple of procedures that satisfies the above condition can be used as the basis for implementing pairs. This point is illustrated strikingly by the fact that we could implement cons, car, and cdr without using any data structures at all but only using procedures. Here are the definitions:

(define (cons x y)
  (define (dispatch m)
    (cond ((= m 0) x)
          ((= m 1) y)
          (else (error "Argument not 0 or 1 -- CONS" m))))
  dispatch) 

(define (car z) (z 0)) 

(define (cdr z) (z 1))

This use of procedures corresponds to nothing like our intuitive notion of what data should be. Nevertheless, all we need to do to show that this is a valid way to represent pairs is to verify that these procedures satisfy the condition given above.

This example also demonstrates that the ability to manipulate procedures as objects automatically provides the ability to represent compound data. This may seem a curiosity now, but procedural representations of data will play a central role in our programming repertoire. This style of programming is often called message passing.

So, just to make it clear, what “cons” does is construct a procedure that “cons” calls “dispatch” (I say it this way, because “dispatch” is defined in cons’s scope), and return that procedure to the caller of “cons”. The procedure that is returned can then be used as data since the above “car” and “cdr” procedures produce the same result as the “car” and “cdr” procedures we’ve been using in the Scheme language.
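
A quick check of that claim:

(define z (cons 3 4))   ; z is bound to an instance of dispatch
(car z)                 ; => 3, by evaluating (z 0)
(cdr z)                 ; => 4, by evaluating (z 1)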

SICP mentions in this section that message passing will be used further as a basic tool in Chapter 3 for more advanced concepts. Note that I have talked about message passing before. It is central to the power of object-oriented programming.

Note as well that the text implies an analog between procedures, in a Lisp language, and objects. The above example is perhaps a bad way to write code to get that across. In the corresponding lecture for this section (Lecture 2b, “Compound Data”) Abelson used a lambda ( (lambda (m) (cond …)) ) instead of “(define (dispatch m) …)”. What they meant to get across is that just as you can use “cons” multiple times in Scheme to create as many pairs as you want, you can use this rendition of “cons” to do the same. Instead of creating multiple blobs of memory which just contain symbols and pointers (as pairs are traditionally understood), you’re creating multiple instances of the same procedure, each one with the different pair values that were assigned embedded inside them. These procedure instances exist as their own entities. There is not just one procedure established in memory called “dispatch”, which is called with pieces of memory to process, or anything like that. These procedure instances are entities “floating” in memory, and are activated when parameters are passed to them (as the above “car” and “cdr” do). These procedures are likewise garbage collected when nothing is referring to them anymore (they pass out of existence).
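
Sketched out, the lecture’s version looks like this (same behavior, but each call to cons visibly constructs a fresh anonymous procedure):

(define (cons x y)
   (lambda (m)
      (cond ((= m 0) x)
            ((= m 1) y)
            (else (error "Argument not 0 or 1 -- CONS" m)))))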

In the above example, “cons” is the constructor (of a procedure/object called “dispatch”), and “car” and “cdr” are the selectors. What we have here is a primitive form of object-orientation, similar to what is known in the Smalltalk world, with the mechanics of method dispatch exposed. The inference from this is that objects can also be data.

What SICP begins to introduce here is a blurring of the lines between code and data, though at this point it only goes one way, “Code is data.” Perhaps at some point down the road it will get to sophisticated notions of the inverse: “Data is code”. The book got into this concept in the first chapter just a little, but it was really simple stuff.

Getting back to our traditional notions of data, I hope this will begin to show how limited those notions are. If procedures and objects can be data, why can’t we transmit, receive, persist, and search for them just as we can with “dead” data (strings and numbers)? In our traditional notions of CS and IT (assuming they’re focused on using OOP) we use classes to define methods for dealing with data, and elemental interactions with other computing functions, such as data storage, display, interaction between users and other computers, etc. We have a clear notion of separating code (functionality) from data, with the exception of objects that get created while a program is running. In that case we put a temporary “persona” over the real data to make it easier to deal with, but the “persona” is removed when data is stored and/or we exit the program. What we in the whole field neglect is that objects can be data, but they can only really be such if we can do everything with them that we now do with the basic “dead” forms of data.

There is at least one solution out there that is designed for enterprises, and which provides inherent object persistence and search capabilities, called Gemstone. It’s been around for 20+ years. The thing is most people in IT have never heard of it, and would probably find the concept too scary and/or strange to accept. From what I’ve read, it sounds like a good concept. I think such a system would make application design, and even IT system design if it was done right, a lot more straightforward, at least for systems where OO architecture is a good fit. Disclaimer: I do not work for Gemstone, and I’m not attempting to advertise for them.

As I think about this more advanced concept, I can see some of where the language designers of Smalltalk were going with it. In it, objects and their classes, containing information, can be persisted and retrieved in the system very easily. A weakness I notice now in the language is there’s no representational way to create instance-specific functionality, as exists in the above Scheme example. Note that the values that are “consed” are inlined in “dispatch”. There’s a way to put instance-specific code into an object in Smalltalk, but it involves creating a closure–an anonymous object–and you would essentially have to copy the code for “dispatch”. You would have to treat Smalltalk like a functional language to get this ability. The reason this interests me is I remember Alan Kay saying once that he doesn’t like the idea of state being stored inside of an object. Well, the above Scheme example is a good one for demonstrating “stateless OOP”.

Edit 7-9-2010: After I posted this article I got the idea to kind of answer my own question, because it felt insensitive of me to just leave it hanging out there when there are some answers that people already know. The question was, “Why don’t we have systems of data that allow us to store, retrieve, search, and transmit procedures, objects, etc. as data?” I’ve addressed this issue before from a different perspective. Alan Kay briefly got into it in one of his speeches that’s been on the internet. Now I’m expanding on it.

The question was somewhat rhetorical to begin with. It’s become apparent to me as I’ve looked at my programming experience from a different perspective that a few major things have prevented it from being resolved. One is that we view computers as devices for automation, not as a new medium. So the idea that computers add meta-knowledge to information, and that this meta-knowledge would be valuable to the user, is foreign, though slowly it’s becoming less so with time, particularly in the consumer space. What’s missing is a confidence that end users will be capable of, and want to, manipulate and modify that meta-knowledge. A second reason is that each language has had its own internal modeling system, and for quite a while language designers have gotten away with this, because most people weren’t paying attention to that. This is beginning to change, as we can see in the .Net and Java worlds, where a common modeling system for languages is preferred. A third reason is that with the introduction of the internet, openness to programming is seen as a security risk. This is partly because of bad system design, and partly because, in the name of being “all things to all people,” we have systems that compromise: they accept some risk in order to remain backward compatible with older models of interaction that became less secure once wide-scale networking arrived, and to cater to programmers’ desires for latitude and features so that the platform will be more widely accepted, or so the thinking goes.

No one’s figured out a way to robustly allow any one modeling system to interact or integrate with other modeling systems. There have been solutions developed to allow different modeling systems to interact, either via XML over networking protocols, or by developing other languages on top of a runtime. I’ve seen efforts to transmit objects between two specific runtimes, .Net and the JVM, but this area of research is not well developed. Each modeling system creates its own “walled garden”, usually with some sort of “escape hatch” to allow a program written in one system to interact with another (usually a C interface or some FFI (foreign function interface)), which isn’t that satisfactory. It’s become a bit like what used to be a plethora of word processing and spreadsheet document file formats, each incompatible with the others.

Programmers find file format standards easier to deal with, because it involves loading binary data into a data structure, and/or parsing a text file into such a structure, but this limits what can be done with the information that’s in a document, because it takes away control from the recipient in how the information can be used and presented. As I’ve discussed earlier, even our notions of “document” are limiting. Even though it could serve a greater purpose, it seems not much time has been invested in trying to make different modeling systems seamless between each other. I’m not saying it’s easy, but it would help unleash the power of computing.

Code is encoded knowledge. Knowledge should be fungible such that if you want to take advantage of it in a modeling system that is different from the one it was originally encoded in, it should still be accessible through some structure that is compatible with the target system. I understand that if the modeling system is low level, or more primitive, the result is not going to look pretty, but it could still be accessed through some interface. What I think should be a goal is a system that makes such intermingling easy to create, rather than trying to create “runtime adapters”. This would involve understanding what makes a modeling system possible, and focusing on creating a “system for modeling systems”. You can see people striving for this in a way with the creation of alternative languages that run on the JVM, but which have features that are difficult to include with the Java language itself. Java is just used as a base layer upon which more advanced modeling systems can be built. This has happened with C as well, but Java at least allows (though the support is minimal) a relatively more dynamic environment. What I’m talking about would be in a similar vein, except it would allow for more “machine models” to be created, as different modeling systems expect to run on different machine architectures, regardless of whether they’re real or virtual.

I think the reason for this complexity is, as Alan Kay has said, we haven’t figured out what computing is yet. So what language designers create reduces everything down to one or more aspects of computing, which is embedded in the design of languages.

So my question is really a challenge for future reference.

— Mark Miller, https://tekkie.wordpress.com

Read Full Post »

SICP Exercise 2.32

This is a real interesting one! I highly recommend working on it. I rank it up there with Exercise 1.11 in that once you get the answer, you’ll be amazed that the method really works. This is the kind of problem where functional programming really shines. All of the rules for it make the solution work out beautifully.

The end result, what should come out of the “subsets” function, comes from the classical mathematical notion of subsets. The set of subsets of a (non-empty) set always contains at least two members: the original set itself and the empty set. You see that in the example that’s given. Since order does not matter in a set, you don’t need to deal with repetitions like (1 2) and (2 1); in fact, such repetitions should not show up in the result.

I’m not totally positive about this, but I think the way you achieve the answer has nothing to do with classical mathematics. Once again, I think this is solved using computing processes, not ones from classical mathematics. In that way it’s like Exercise 1.11 as well.

What I did to begin solving this was to look at the logic I was given, and do a walkthrough of it until I reached the point where I needed to fill in my own logic. I found it helpful to use the example set, and compare what I was getting to the expected result. As you back out of the recursion, look at what can be produced by using “map” with some logic that you think makes sense at each stage. And let your own logic evolve, if need be, to what you think you need to produce at each stage.
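
For reference, this is the logic you’re given (the <??> is the part you write; the book has (list nil) where I’ve written (list '())):

(define (subsets s)
   (if (null? s)
       (list '())
       (let ((rest (subsets (cdr s))))
          (append rest (map <??> rest)))))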

Happy programming!

Read Full Post »

I’ve jumped ahead some, since I’m going through the book with a local Lisp users group. They’ve decided to not go through all of the exercises, just some that we select. I may backtrack at some point to cover some of the ones I missed.

I won’t say much about this exercise. If it mystifies you from the start, look up information on “lambda calculus”. I had to review this as well before I started into it, because it had been a couple years since I last looked at stuff like this.

I found that it didn’t help at all to try to execute stuff in a Scheme interpreter. That misses the point. I found it helpful to just write down, or type out in a text editor, a walkthrough of the logic in the “zero” and “add-1” functions as I used them, to figure out part of the exercise.
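
For reference, these are the definitions the exercise hands you:

(define zero (lambda (f) (lambda (x) x)))

(define (add-1 n)
   (lambda (f) (lambda (x) (f ((n f) x)))))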

Getting to “one” and “two” was pretty easy. What the exercise says to do (it says “use substitution”) is use the functions you are given to figure out what “one” and “two” are. In other words, carry out the logic and see what you get.

The challenging part is figuring out what the “+” operation should be. It helps to look at this at a conceptual level. You don’t have to do a math proof to get this, though understanding how to do an inductive proof helps in giving you ideas for how to proceed with this part (if you need practice with doing an inductive proof, try Exercise 1.13). Keep in mind that this is a computing problem, not a classical mathematics problem (though there is definitely mathematics of a sort going on as you go through the logic).

The idea of the exercise is to get a different idea of what the concept of number can mean via computing. It helps to be familiar with mathematical logic for this one, to see what you’re really doing, though I think you can get by without it if that’s not one of your competencies. If you’ve ever seen a mathematical explanation for the existence of natural numbers, this is reminiscent of that.

I don’t want to give away the answer, except to mention an article I found, called “The Genius of Alonzo Church (rerun)”, by Mark Chu-Carroll. NOTE: Do not read this article until you’ve done the problem! It will give away the answer. I only mention it because Mark says something quite beautiful about what the answer means. It ties right into a concept in object-oriented programming.

Read Full Post »

This problem challenged me to really think about the efficiency of my algorithm. I have trouble relating arithmetic notions to computing, so understanding the information that the authors had laid out was a problem for me at first.

Whatever solution you come up with for this, be sure to check its Big-O function. I thought I had written a novel solution, based on the ideas in the book. What I found after checking the number of steps it took was that its performance was better than linear, but it was close to linear. I wasn’t satisfied with that. So I looked at ways of making it more efficient, and found that not only was I inefficient in my process, but I was also inefficient with my code. You need to put the focus on successive squaring. A good working knowledge of what exponents represent will work in your favor.
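
For orientation, this is the recursive fast-expt from Section 1.2.4, the one the exercise asks you to recast as an iterative process (square is the obvious helper):

(define (square x) (* x x))

(define (fast-expt b n)
   (cond ((= n 0) 1)
         ((even? n) (square (fast-expt b (/ n 2))))
         (else (* b (fast-expt b (- n 1))))))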

I thought at first that I needed two functions: one for even powers, and one for odd, because I figured out different implementations for each. Then I found that I needed only one function for computing an exponent. There’s a hint about that in the text: b^n = b · b^(n-1).

I found the hints written into the exercise a bit confusing, but now that I’ve solved it I see what they were getting at. To me, they turned out to be somewhat helpful, but also somewhat misleading. They tried to be helpful, but they didn’t want to reveal too much. In the end I think they were “too smart” for their own good.

One of the hints is an abstraction. One of the hints relates to a concept that was discussed earlier, and it’s just supposed to help you see it more clearly so you can use the idea in your code. Don’t think that you need to translate all of the hints literally into equivalent code. Use them as inspiration–use what you find useful at the moment. If it doesn’t make sense after thinking about it for a bit, just put it out of your mind. Focus on developing what you think will solve the problem. Once you get something going, see how it does performance-wise, and then perhaps reconsider the hints. Don’t be thinking about, “Did I put all of the hints in my code correctly?” Once you get the problem solved, you’ll see as I did what the hints were all about.

In one of my early versions of a solution, I noticed it was making “leaps and bounds” using successive squaring in computing the answer, and then I had it slow down quite a bit because I was trying to get to an intermediate step, which the successive squaring technique would overshoot. I’m being a bit misleading by saying this, but I don’t want to give too much away. Look at the gap between where successive squaring gets you, and where you need to go, and think of how to get through that gap more quickly. The hint I’ll give you is: once you’ve gotten to the point of considering this, the answer is already within your grasp.

Read Full Post »

This is the first in what I hope will be many posts talking about Structure and Interpretation of Computer Programs, by Abelson and Sussman. This and future posts on this subject will be based on the online (second) edition that’s available for free. I started in on this book last year, but I haven’t been posting about it because I hadn’t gotten to any interesting parts. Now I finally have.

After I solved this problem, I really felt that this scene from The Matrix, “There is no spoon” (video), captured the experience of what it was like to finally realize the solution. First realize the truth, that “there is no spoon”–separate form from appearance (referring to Plato’s notion of “forms”). Once you do that, you can come to see, “It is not the spoon that bends. It is only yourself.” I know, I’m gettin’ “totally trippin’ deep, man,” but it was a real trip to do this problem!

This exercise is an interesting look at how programmers such as myself have viewed how we should practice our craft. It starts off innocuously:

A function f is defined by the rule that f(n) = n if n < 3 and f(n) = f(n – 1) + 2f(n – 2) + 3f(n – 3) if n >= 3. Write a procedure that computes f by means of a recursive process. Write a procedure that computes f by an iterative process.

(Update 5-22-2010: I’ve revised some wording in this post to be more precise in my description.) Writing it recursively is pretty easy. You just do a translation from the classical algebraic notation to Scheme code. It speaks for itself. The challenging part is writing a solution that gives you the same result using an iterative process. Before you say it can’t be done, it can!
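
To make the first half concrete, here is a sketch of the recursive translation (the iterative version, which is the real exercise, stays yours to find):

(define (f n)
   (if (< n 3)
       n
       (+ (f (- n 1))
          (* 2 (f (- n 2)))
          (* 3 (f (- n 3))))))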

I won’t say never, but you probably won’t see me describing how I solve any of the exercises, because that makes it too easy for CS students to just read this and copy it. I want people to learn this stuff for themselves. I will, however, try to give some “nudges” in the right direction.

  • I’ll say this right off the bat: This is not an algebra problem you’re dealing with. It’s easy to fall into thinking it is, especially since you’re presented with some algebra to implement. This is a computing problem. I’d venture to say it’s more about the mathematics of computing than it is about algebra. As developers we often think of the ideal as “expressing the code in a way that means something to us.” While this is ideal, it can get in the way of writing something optimally. This is one of those cases. I spent several hours trying to optimize the algebraic operations, and looking for mathematical patterns that might turn the recursive algorithm into an iterative one. Most of the patterns I thought I had fell to pieces. It was a waste of time anyway. I did a fair amount of fooling myself into thinking that I had created an iterative algorithm when I hadn’t. It turned out to be the same recursive algorithm done differently.
  • Pay attention to the examples in Section 1.2.1 (Linear Recursion and Iteration), and particularly Section 1.2.2 (Tree Recursion). Notice the difference, in the broadest sense, in the design between the recursive and iterative algorithms in the sample code.
  • As developers we’re used to thinking about dividing code into component parts, and that this is the right way to do it. The recursive algorithm lends itself to that kind of thinking. Think about information flow instead for the iterative algorithm. Think about what computers do beyond calculation, and get beyond the idea of calling a function to get a desired result.
  • It’s good to take a look at how the recursive algorithm works to get some ideas about how to implement the iterative version. Try some example walk-throughs with the recursive code and watch what develops.
  • Here’s a “you missed your turn” signpost (using a driving analogy): If you’re repeating calculation steps in your iterative algorithm, you’re probably not writing an iterative procedure. The authors described the characteristics of recursive and iterative procedures earlier in the chapter. See how well your procedure fits either description. There’s a fine line between what’s “iterative” and what’s “recursive” in Scheme, because in both cases you have a function calling itself. The difference is that in a recursive procedure you tend to have the function calling itself more than once in the same expression, whereas in an iterative procedure you tend to have a simpler calling scheme, where the function calls itself only once in an expression (though it may do so in more than one expression inside the function), and all you’re doing is computing an intermediate result, and incrementing a counter, to input into the next step. As a rule, you should not have operators “lingering”, waiting for a function call to return, though the authors did show an example earlier of a function that was mostly iterative, with a little recursion thrown in now and then. With this exercise I found that I was able to find a solution that was strictly iterative.
  • As I believe was mentioned earlier in this chapter (in the SICP book), remember to use the parameter list of your function as a means for changing state.
  • I will say that both the recursive and the iterative functions are pretty simple, though the iterative function uses a couple concepts that most developers are not used to. That’s why it’s tricky. When I finally got it, I thought, “Oh! There it is! That’s cool.” When you get it, it will just seem to flow very smoothly.

Happy coding!

Read Full Post »