SICP: Exercise 1.19 – optimizing Fibonacci

I’m backtracking with this post. My last SICP post talked about sections 2.2 and 2.3. I’m now going back to Section 1.2 to cover an exercise I had not done until recently. I may get into more of them. Time will tell.

One of the things Section 1.2 talks about is how to calculate Fibonacci numbers recursively and iteratively. The recursive procedure is O(Φn), and the iterative method is O(n). Another concept this section brings up over and over again is exponentiation, and this exercise prompts you to look at it in a new way: It just means repeating something n times, and the operation doesn’t necessarily have to do with multiplication.

The exercise says there’s an algorithm for calculating Fibonacci numbers in O(log n). I had a few false starts with this. The only way I was able to figure it out was to go through the text of the exercise slowly and methodically, and do what each part prompted me to do, before going on to the next sentence. I think once again my ignorance of mathematics came out, because it looks like this exercise was written by a mathematician. The author had a way of expressing things that was foreign to me, and so I had to spend some time teasing out what they meant. I’m going to break up the exercise text into “bite size” pieces to clarify what’s being communicated.

The first part I worked on was:

Recall the transformation of the state variables a and b in the fib-iter process of section 1.2.2: aa + b and ba. Call this transformation T, and observe that applying T over and over again n times, starting with 1 and 0, produces the pair Fib(n + 1) and Fib(n).

I’m not going to give you the play-by-play of this step, but I will clarify what the author is saying. Where it says, “Call this transformation T“, it means keep this association in mind:

T: aa + b and ba (In other words, when you think of T, think of the transformation aa + b and ba)

By “transformation” they mean, for example, “The new value of a (to the left of the arrow) is derived from the formula on the right side of the arrow.” The same goes for b. Where it says, “observe that applying T over and over again n times, starting with 1 and 0, produces the pair Fib(n + 1) and Fib(n),” it means calculate the values for a and b by using the transformation T and apply the results to itself over and over again, starting with the values a = 1, and b = 0. Observe that the pattern of pairs (the values for a and b) is the same as if you had calculated the Fibonacci pairs Fib(n + 1) and Fib(n) for each transformation step.

Think of this as “Step 1” for the exercise. It’s important that you do it, because this sequence of calculations becomes relevant later.

Next, it gets into an expanded concept of exponentiation:

In other words, the Fibonacci numbers are produced by applying Tn, the nth power of the transformation T, starting with the pair (1,0).

All the author is trying to do here is recast what you just did (applying the transformation T to itself over and over again) as the concept of taking something to the nth power. It does not change the meaning of what you did at all. All it represents is looking at what you did in a different way; in a way you probably had never imagined was possible. The reason the author wrote this was to help you relate to a concept, which will be introduced later in the exercise text.

Just a little hint here. Since the exercise brought up the issue of exponents in relation to iterations of the transformation T, I noticed that it was helpful later on if I put the corresponding exponent (1, 2, 3, etc.) next to each Fibonacci pair in the part I labeled above, “Step 1”.

Next, it gets really confusing! I don’t know about you, but I think they could’ve written this more clearly. Ready? Here we go!

Now consider T to be the special case of p = 0 and q = 1 in a family of transformations Tpq, where Tpq transforms the pair (a,b) according to abq + aq + ap and bbp + aq.

I had to look at this sentence a bunch of times before I got it. What I got hung up on was the use of the term “T” and the reuse of the variables a and b in the transformation, combined with the fact that the only variables that appeared to be associated with T (on first blush) were p and q. I had to keep in mind, though, that a and b were not used as annotations for the transformation T above, either. I think the reason for the “pq” annotation will become clear as you go through this part.

What the author is really saying is, “Let’s bring in a different kind of transformation we’ll call Tpq. It’s defined as: Tpq: abq + aq + ap and bbp + aq.” The transformation T has not changed. It still exists (separate from Tpq), and it’s still: T: aa + b and ba. What the author is saying is, “If you use (apply) transformation Tpq with the parameters p = 0 and q = 1, the algebraic result is the same as the transformation T.” Go ahead. Give it a try, and convince yourself this statement is true.

This is a setup for the next sentence, which is equally as confusing as the last one:

Show that if we apply such a transformation Tpq twice, the effect is the same as using a single transformation Tp’q’ of the same form, and compute p’ and q’ in terms of p and q.

Okaaaaay. Got it? The way I interpreted this was, “Apply Tpq twice, with the parameters p = 0 and q = 1. Next, let’s introduce a second instance of the formula for Tpq we’ll call “Tp’q’“, with variables p’ and q’ (abq’ + aq’ + ap’ and bbp’ + aq’ — This is where “of the same form” comes from). See if you can find a way to make Tp’q’ equal the second-order transformation of Tpq (the one where you applied the transformation twice with the parameters p = 0 and q = 1). Once you’ve done this, come up with a way to derive the values for p’ and q’ from p and q.” Anyway, this interpretation worked for me. I hope I’ve expressed this clearly. I’m leaving a bit of mystery here to maintain fidelity to the way the original exercise was written.

I’m going to leave this part as a challenge for you to figure out, though the last part of the exercise text (below) gives you a clue about what’s going on here. I’ll just say that everything you have done from the beginning of this exercise up to this point has provided you with all of the information you need to come up with the solution for this.

The exercise finishes with:

This gives us an explicit way to square these transformations, and thus we can compute Tn using successive squaring, as in the fast-expt procedure. Put this all together to complete the following procedure, which runs in a logarithmic number of steps.

A Scheme procedure follows in the text, where most of the code is filled in. All you have to do is fill in the steps where p’ and q’ need to be computed. (Note: I talked about fast-expt earlier here).

Okay, now I think this ending is somewhat misleading. I did not find after doing the above steps that I immediately knew how to do successive squaring in this exercise. It’s going to take some thinking on your part, considering what you’ve done up to this point, to figure out how to do this. What the Scheme code makes obvious is the act of successive squaring takes place solely in the code that computes p’ and q’. I will say that the behavior of successive squaring here is like that of successive squaring in the exercise with fast-expt (Exercise 1.16) in the sense that you’re “jumping” to successive values to get to Fib(n), rather than computing each value as you incrementally approach n.

One hint I’ll give you is once you think you’ve figured out how to compute p’ and q’ in the code, make sure you try out higher values of n with the “fib” procedure in this exercise, like Fib(10), Fib(11), Fib(12), etc., and compare what you get with the known Fibonacci sequence (you can compare it with output of some other version of the “fib” procedure that’s discussed earlier in this section, or you can look up the sequence on the internet). You’ll probably realize that the formulas you came up with are wrong when you do this (I did), and you need to re-evaluate them. The exercise guides you a part of the way to the answer, but makes it sound like you’ve made it most of the way, when you really haven’t.

Maybe I just need to hang out with mathematicians more to see how they use language, but I thought this exercise was poorly written. It seemed to try to hide things by being obscure, to make the problem more challenging, rather than being straightforward and setting forth the challenge in the last step, which is really where it is.

This exercise is worth doing for the beautiful solution that results. It goes to show that no matter how efficient you think your algorithm is, it’s worth exploring to see if you can make it even more efficient, and that mathematics is key to figuring out how to do that.