In “Burn Math Class”, Jason Wilkes spends quite a few pages deriving the value of \(e\). I did not notice him at any point mentioning compound interest. Since we’re currently wrapping up the chapter on exponential functions and logarithms in the Algebra II classes I’m teaching, I was already thinking about the best way to introduce \(e\).
\(e\) is a strange bird: As a numeric value on its own, it’s only of passing mathematical interest. What makes it most useful is when it’s the base of the function \(e^x\) and the logarithm \(\ln x\). So I can see how Wilkes concludes that it’s best to derive it using a Taylor series and derivatives, but I don’t agree with that as the first approach.
First, here are some ways we could introduce it:
- By fiat. It’s a number. We won’t tell you why it’s important. Just learn it and get on with it. This is the basic approach in Prentice Hall’s Algebra 2 with Trigonometry (2001, p. 559): “One of the most important numbers, an irrational number called \(e\), occurs in advanced mathematics, economics, statistics, probability, and in many situations involving growth. \(e\approx2.718281828459…\).”
- Using the compound interest formula and hinting at limits. This is reflected in Holt’s Algebra 2 (2007, p. 531): “Suppose that $1 is invested at 100% interest (\(r=1\)) compounded \(n\) times for one year as represented by the function \(f(n)=\left(1+\frac{1}{n}\right)^n\). As \(n\) gets very large, interest is continuously compounded. Examine the graph of \(f(n)=\left(1+\frac{1}{n}\right)^n\). The function has a horizontal asymptote. As \(n\) becomes infinitely large, the value of the function approaches approximately 2.7182818…. This number is called \(e\). Like \(\pi\), the constant \(e\) is an irrational number.”
- Using limits, possibly but not necessarily referencing the compound interest formula. Here’s the relevant passage from McDougal Littell’s Algebra 2 (2008, p. 492): “The history of mathematics is marked by the discovery of special numbers such as \(\pi\) and \(i\). Another special number is denoted by the letter \(e\). The number is called the natural base \(e\) or the Euler number after its discoverer, Leonhard Euler (1707-1783). The expression \(\left(1+\frac{1}{n}\right)^n\) approaches \(e\) as \(n\) increases.” Below this is a table showing the value of this expression at \(n = 10^1, 10^2, 10^3, 10^4, 10^5, 10^6\), then a Key Concept box that says, “The natural base \(e\) is irrational. It is defined as follows: As \(n\) approaches \(+\infty\), \(\left(1+\frac{1}{n}\right)^n\) approaches \(e \approx 2.718281828.\)” Two pages later, there’s another Key Concept box that ties the compound interest formula to \(A = Pe^{rt}\).
- Using derivatives and then limits. This is how Wilkes supports his first formula for \(e\): He uses the derivative of \(e^x\) to build \(\lim\limits_{n\rightarrow\infty}\left(1+\frac{1}{n}\right)^n\). I’ll explain that below.
- Using derivatives to build a Taylor series. This is how Wilkes derives his first formula for \(e\). I’ll also explain this below.
I appreciate Wilkes’s enthusiasm about calculus, but I think he overdoes it in claiming that it’s basically impossible to understand the importance of the natural base without derivatives. His actual claim is that it’s necessary to have basic calculus, but the implication is that, if you don’t understand at least the basics of derivatives, you don’t understand calculus.
Both \(e\) and \(\pi\) first came up with regard to limits. They were estimated, fairly decently, without derivatives. It’s true that the Taylor series we get through derivatives converges much faster than the older methods, but that doesn’t change the fact that there were years or, in the case of \(\pi\), centuries when the sole understanding of the value was as a limit.
My clearest understanding of \(e\) as a numeric value comes from an understanding of limits. It is certainly awesome that the function that must exist happens to be \(e^x\), and that its inverse function fills another gap, but neither of these facts is necessary to an understanding of the specific value of \(e\).
Historically, the first explicit reference to \(e\) appears to come from Jacob Bernoulli asking “what if?” with regard to the compound interest formula, \(\left(1 + \frac{r}{n}\right)^{nt}\). The question: What if we compounded this continuously? To make things easier, we’ll use \(r = 1\) and \(t = 1\), leaving \(\left(1+\frac{1}{n}\right)^n\). This does converge, but it’s pretty slow; when \(n = 1000\), we get about 2.716923932. For a million, we get about 2.718280469. That’s a lot of error.
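If you want to watch that slow crawl yourself, here’s a quick sketch in Python (my own check, not something from Wilkes or the textbooks):

```python
# Watch how slowly (1 + 1/n)^n creeps toward e = 2.718281828...
for n in (10, 100, 1_000, 1_000_000):
    print(f"n = {n:>9,}: {(1 + 1 / n) ** n:.9f}")
# n =        10: 2.593742460
# n =       100: 2.704813829
# n =     1,000: 2.716923932
# n = 1,000,000: 2.718280469
```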
Historically, the first careful method for calculating \(\pi\) comes from calculating the perimeters of regular polygons that fit inside or around a circle: As we double the number of sides of the polygon, we can use trigonometry to find the new perimeter, which gets closer and closer to a value. When doing this by hand, it’s a lot of work. An inscribed polygon with a radius of 1 and 1536 sides has a semiperimeter of around 3.141590. Again, that’s a lot of error for the amount of work.
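Here’s a rough sketch of that doubling process in Python, my own illustration rather than the historical hand computation: start with a regular hexagon inscribed in a unit circle (each side has length 1) and repeatedly double the number of sides.

```python
from math import sqrt

# Each doubling uses s_2n = sqrt(2 - sqrt(4 - s_n**2)), which follows from the
# half-angle identity; the semiperimeter n * s_n / 2 approaches pi.
sides, side = 6, 1.0
while sides < 1536:
    side = sqrt(2 - sqrt(4 - side ** 2))
    sides *= 2
    print(f"{sides:>4} sides: semiperimeter = {sides * side / 2:.6f}")
# The last line is roughly "1536 sides: semiperimeter = 3.141590"
```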
I think that students can understand the basics of what a limit is without going into the nitty-gritty of calculus and infinities. Graph something: Notice that it gets closer and closer to a value. That’s a limit. If anything, we should be spending more time stressing limits as a foundation for calculus. At the same time, though, there are interesting things to be learned from using derivatives.
Using derivatives and then limits
Wilkes is all about going backwards, so I’m going to go backwards from his backwardsness by starting with his second method. Specifically, he asks: How do we find the derivative of \(e^x\) using the standard approach, i.e., that we’re taking the limit of \(\frac{f(x + h) - f(x)}{h}\) as \(h\) approaches 0?
This reflects a very nifty set of steps, and strikes me as an excellent way of helping students see the true power and flexibility of derivatives. It’s important to note that Wilkes is assuming there is some function, \(f(x)\), which is its own derivative, and calling this function \(e^x\). That is, he’s not starting with \(e = 2.71828…\) and looking at the properties of \(e^x\): He’s starting with a function that has a given property, \(f(x) = f'(x)\), and having earlier proven that it has to be of the form \(b^x\), is now trying to calculate the value of \(b\). Backwards.
And backwards is fine, of course, because the point is to prove that two things are identical. If two things are identical, it doesn’t really matter much which direction we start in.
So we start here: \[(e^x)' = \frac{e^{x + h} - e^x}{h}\]
Given that \((e^x)' = e^x\) and that \(a^{b + c} = a^ba^c\), we can rewrite this to: \[e^x = \frac{e^xe^h - e^x}{h} = e^x \frac{e^h - 1}{h} \Rightarrow \frac{e^h - 1}{h} = 1 \\ \Rightarrow e^h - 1 = h \Rightarrow e^h = 1 + h\]
For the next step, take the \(h\)th root of each side, that is, \(e = \left(1 + h\right) ^ \frac{1}{h} \). As \(h\) gets smaller, \(\frac{1}{h}\) gets larger. Substituting \(n = \frac{1}{h}\) gives us \(\left(1+\frac{1}{n}\right)^n\), which is Bernoulli’s formula for calculating \(e\).
There’s quite a bit to unpack here. The details require a firm understanding of exponent power rules, variable manipulation, the relationship between a number and its reciprocal, and naturally derivatives. And it only works on the assumption that \(f(x) = f'(x)\), so the base it produces belongs to the only function of the form \(b^x\) that is its own derivative.
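If you’d rather see the numbers than trust the algebra, here’s a quick numerical sanity check (my own sketch, not Wilkes’s): as \(h\) shrinks, \(\frac{e^h - 1}{h}\) heads toward 1, and \(\left(1 + h\right)^\frac{1}{h}\) heads toward \(e\).

```python
from math import e

# As h shrinks, (e**h - 1) / h should settle toward 1 and (1 + h)**(1/h)
# should settle toward e = 2.71828...
for h in (0.1, 0.01, 0.001, 0.0001):
    print(f"h = {h}: {(e ** h - 1) / h:.6f}   {(1 + h) ** (1 / h):.6f}")
```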
Using derivatives to build a Taylor series
Before reading Wilkes’s analysis, I wasn’t really sure what was going on with Taylor series. And I’m skeptical that it was Wilkes’s exposition per se that helped me out, so much as the key takeaway, which is this:
Assume that all functions can be written as a polynomial.
That’s it. Beautifully simple and intricate at the same time.
Here’s how it works with \(f(x) = e^x\). Again, we’re starting with the knowledge that \(f(x) = f'(x)\) and we’re trying to find the value of \(e\), not the other way around.
First, we need an understanding of the Power Rule for derivatives. If \(f(x) = x^n\), then \(f'(x) = nx^{n-1}\), \(f''(x) = (n)(n-1)x^{n-2}\), and so on. Using this, Wilkes demonstrates (pp. 200-202) that any function can be written as \[f(x) = f(0) + \frac{f'(0)}{1!}x + \frac{f''(0)}{2!}x^2 + \frac{f'''(0)}{3!}x^3 + \frac{f''''(0)}{4!}x^4 + …\] ad infinitum.
Since we want a function whose derivative is itself, that means that \(f(x) = f'(x) = f''(x) = f'''(x) = …\). Since this function is of the form \(b^x\), we know \(f(0) = b^0 = 1\), and so \(f(0) = f'(0) = f''(0) = f'''(0) = … = 1\). In other words, \[e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + …\]
If we want the specific value of \(e\), we look at \(x = 1\), that is: \(1 + 1 + \frac{1}{2} + \frac{1}{6} + \frac{1}{24} + …\)
This converges much faster. After ten terms (through \(\frac{1}{9!}\)), we have 2.718281526. This is more accurate than the earlier formula gives us at one million.
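Here’s a quick sketch of those partial sums in Python (again mine, not Wilkes’s):

```python
from math import factorial

# Partial sums of 1 + 1/1! + 1/2! + 1/3! + ... close in on e very quickly.
total = 0.0
for k in range(10):
    total += 1 / factorial(k)
    print(f"after {k + 1:>2} terms: {total:.9f}")
# The tenth line is roughly "after 10 terms: 2.718281526"
```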
I see the beauty of this explanation, and now I feel that I better understand Taylor series. But I feel like this is too complex: I still think using limits in a table or a graph, as Algebra 2 textbooks generally do now, is more accessible to most students. In the past, textbooks did err by simply announcing the value of \(e\) with little context, and assuming students would obediently use it.
The Other Gap
\(e^x\) is a wonderful function, but its inverse, \(\ln x\), has a useful attribute in its own right.
Going back to the Power Rule, we see that we can output just about any function of the form \(x^n\) as the derivative of another function of the form \(ax^{n+1}\). That is, if I say, “I want a function whose derivative is \(x^{20}\),” I can use the Power Rule to create it, really easily: \(f(x) = \frac{x^{21}}{21} \Rightarrow f'(x) = x^{20}\).
It works for just about any real number. If I want a function whose derivative is \(x^{-3.2}\), well, that’s just \(-\frac{x^{-2.2}}{2.2}\). Just add one to the exponent, and divide by the new exponent.
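Here’s a quick numerical spot check of that recipe, my own sketch, using a centered difference quotient as a stand-in for the derivative and an arbitrary test point:

```python
# "Add one to the exponent, divide by the new exponent," then check that the
# slope of the result matches the function we wanted as a derivative.
def slope(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2 * h)

x = 1.5  # arbitrary test point
print(slope(lambda t: t ** 21 / 21, x), x ** 20)          # both roughly 3325.26
print(slope(lambda t: -(t ** -2.2) / 2.2, x), x ** -3.2)  # both roughly 0.273
```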
“Just about”: It doesn’t work if I want a function whose derivative is \(x^{-1}\), because that strategy gives us \(\frac{x^0}{0}\), which is undefined.
However, the derivative of \(\ln(x)\) is \(x^{-1}\), filling the gap.
Incidentally, this suggests that the antiderivative \(\frac{x^a}{a}\) (the function whose derivative is \(x^{a-1}\)) should somehow converge on \(\ln(x)\) as \(a\) gets smaller and smaller… and it doesn’t. Instead, it gets farther away, blowing up toward \(+\infty\) as \(a\) shrinks. So what gives?
While \(\frac{x^a}{a}\) runs off toward \(+\infty\), its mirror image \(-\frac{x^{-a}}{a}\) (the function whose derivative is \(x^{-a-1}\)) runs off toward \(-\infty\). If we take the average of those two functions, \(\frac{x^a - x^{-a}}{2a}\), we get something whose limit, as \(a\) approaches 0, is indeed \(\ln(x)\). The two functions go towards infinity in their opposite directions at just such a rate that the average is the natural logarithm function. Check it out for yourself.
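Here’s one way to check it numerically, a quick sketch of my own:

```python
from math import log

# x**a / a races off toward +infinity as a shrinks, and -x**(-a) / a races off
# toward -infinity, but their average, (x**a - x**(-a)) / (2 * a), settles
# down onto ln(x).
x = 5.0
for a in (0.1, 0.01, 0.001, 0.0001):
    print(f"a = {a}: average = {(x ** a - x ** -a) / (2 * a):.6f}")
print(f"ln({x}) = {log(x):.6f}")
```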