I’ll keep this one short. Also, it’s on calculus, for whatever that’s worth.

The Power Rule for Differentiation says that the derivative of a monomial \(ax^b\) is \(abx^{b-1}\). Last night I noticed a way to derive this for positive integers that I believe I’ve seen before (so I’m not claiming originality), but which is different from the standard way of doing so.

First of all, though, the context: I wasn’t even thinking about calculus! I was thinking about the standard linear equation, which is typically written in the United States as \(y=mx+b\).

I have a longer post simmering in my head about that, but for now let’s just look at each piece. The equation represents a relationship between four key concepts involved in a mathematical model:

- **The initial value,** *b*. This doesn’t necessarily mean that there were no values before, but rather that we’re going to select some point in time and declare it to be the “zero” point.
- **The rate of change,** *m*.
- **Some independent value,** *x*.
- **Some dependent value,** *y*. This is often called the output.

Note that I am using the function-based terms, not the graphical terms (respectively: the \(y\)-intercept, the slope, the \(x\) variable, and the \(y\) variable).

We tend to think of \(b\) and \(m\) as constants and \(x\) and \(y\) as variables, and I’ll get into that in more detail in my other post, but in the interest of (relative) brevity, I’ll just say: It doesn’t have to be that way.

First of all, if \(m\) is constant, then the rate of change is likewise constant, which follows directly from applying the Power Rule to \(mx\): The predicted derivative is \(1mx^0 = m\).
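As a quick sanity check on that claim, here is a small Python sketch that estimates the rate of change of \(mx + b\) numerically at a few inputs. The helper name `slope`, the step size `h`, and the constants `m = 3` and `b = 5` are all arbitrary choices of mine for illustration, not anything the argument depends on.

```python
def slope(f, x, h=1e-6):
    """Central-difference estimate of the rate of change of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Arbitrary illustrative constants (assumptions, not from the post).
m, b = 3.0, 5.0

def line(x):
    """The standard linear equation y = mx + b."""
    return m * x + b

# Estimate the rate of change at several different inputs.
estimates = [slope(line, x) for x in (-2.0, 0.0, 1.5, 10.0)]
print(estimates)
```

Up to floating-point noise, every estimate should come out equal to `m`, no matter which input we pick, which is what a constant rate of change means.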

So I was thinking: What happens when \(m\) in particular is a variable? We could do a few things with that, but perhaps the simplest is to make it a variable that’s dependent on \(x\). And since the simplest function (other than a plain constant) is \(y=x\), let’s use that.

Let’s write a function where the rate of change varies directly with the independent variable. That is, \(y \cdot x + b\), where \(y = x\) takes the place of \(m\).

In this case, \(b\) is still a constant. Indeed, it doesn’t really make sense for our initial value to be variable *dependent on* the input value, since our initial value, by definition, is the value of the function when the input value is 0.

What is the derivative of \(y\cdot x + b\)? Differentiating implicitly gives us \(y \cdot x' + y' \cdot x + 0\). But since \(y = x\), we know \(y' = x'\), and since \(x' = 1\), we get \(y\cdot x' + y'\cdot x + 0 = 2\cdot x \cdot x' = 2x\).

In other words, if we see \(y = x^2\) as a function whose rate of change varies directly with the input value, we get the same derivative that the Power Rule gives us.
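To see the same thing numerically, here is a rough Python sketch that compares the product-rule expression \(y \cdot x' + y' \cdot x\) against a direct numerical estimate of the slope of \(y \cdot x + b\). The names `slope`, `rate`, and `product_rule`, and the constant `b = 5`, are my own illustrative choices.

```python
def slope(f, x, h=1e-6):
    """Central-difference estimate of the rate of change of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

b = 5.0  # arbitrary initial value (an assumption for illustration)

def rate(x):
    """The varying 'm' from the text: here y = x."""
    return x

def f(x):
    """The function y * x + b, with y = x."""
    return rate(x) * x + b

def product_rule(x):
    """Evaluate y * x' + y' * x + 0, using x' = 1 and y' = x'."""
    x_prime = 1.0
    y_prime = x_prime  # because y = x
    return rate(x) * x_prime + y_prime * x

# Each entry: (input, product-rule value, numerical slope of f).
checks = [(x, product_rule(x), slope(f, x)) for x in (-2.0, 0.5, 3.0)]
for x, by_rule, by_slope in checks:
    print(x, by_rule, by_slope)
```

Both columns should agree with \(2x\) to within floating-point error, and changing `b` should make no difference, since the initial value contributes nothing to the rate of change.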

Naturally, since this explanation relies on implicit differentiation, which is a topic that is usually covered well after standard differentiation is, it’s not particularly useful as a direct demonstration of the Power Rule. However, I do think it’s useful for reinforcing this idea of rate of change.

We can then expand this: If \(m = x^2\), then we differentiate \(x^2 \cdot x\) as \(2x \cdot x + x^2 \cdot 1 = 3x^2\), and so on. A more rigorous proof involves showing that if the derivative of \(x^k\) is \(kx^{k-1}\), then the derivative of \(x^{k+1}\) is \((k+1)x^k\), that is: \[\begin{align}\text{Given:}& \\&(x^k)' = kx^{k-1}\\\text{Then:}& \\&(x^{k+1})' = (x^k \cdot x)' = (x^k)' \cdot x + x^k \cdot x'\\&=(kx^{k-1})x + x^k = kx^k + x^k\\&=(k+1)x^k\end{align}\]

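Since the rule also holds for \(x^0\) (the derivative of the constant 1 is 0) and for \(x^1\) (the derivative of \(x\) is 1), induction shows that the Power Rule holds for at least non-negative integers.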
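The inductive step can also be acted out mechanically. The sketch below is my own illustration, not a proof: it stores a polynomial as a list of coefficients (index = exponent), starts from \(x\) with \(x' = 1\), and builds the derivative of \(x^n\) using nothing but the product rule. The function names `poly_mul`, `poly_add`, and `power_and_derivative` are hypothetical helpers I made up for this purpose.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, c in enumerate(q):
            out[i + j] += a * c
    return out

def poly_add(p, q):
    """Add two polynomials, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + c for a, c in zip(p, q)]

X = [0, 1]     # the polynomial x
X_PRIME = [1]  # base case: the derivative of x is 1

def power_and_derivative(n):
    """Return (x^n, (x^n)') for n >= 1, using only the product rule."""
    power, deriv = X, X_PRIME
    for _ in range(n - 1):
        # (x^k * x)' = (x^k)' * x + x^k * x'
        deriv = poly_add(poly_mul(deriv, X), poly_mul(power, X_PRIME))
        power = poly_mul(power, X)
    return power, deriv

for n in range(1, 6):
    _, deriv = power_and_derivative(n)
    print(n, deriv)
```

For \(n = 3\), for example, the derivative list is `[0, 0, 3]`, which is the coefficient form of \(3x^2\). In general the only nonzero coefficient is \(n\), sitting at exponent \(n - 1\), just as the Power Rule predicts.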

Again, I’m not holding this up as an original concept at all, just something I noticed and that helped me understand the concept of “rate of change” that much better.