Background
There are two common ways of finding the roots of quadratic equations, that is, equations of the form \[ax^2 + bx + c = 0\]
The one that’s usually taught first is a shortcut that works best when \(a = 1\) and \(c\) has two integer factors that sum to \(b\). In fact, that’s exactly how it’s taught: find two numbers that multiply to \(c\) and add to \(b\).
For instance, in \(x^2 + 5x + 4\), 4 has the factors 1 and 4, which add to 5; the quadratic can then be factored: \(x^2 + 5x + 4 = (x + 4)(x + 1)\), and so the expression has roots at \(x = -4\) and \(x = -1\). (Graphed on WolframAlpha)
Meanwhile, in \(x^2 + 4x + 4\), 4 has the factors 2 and 2, which add to 4. The quadratic can be factored as \((x + 2)(x + 2)\), and it has a single repeated root at \(x = -2\). (Graphed on WolframAlpha)
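The trial-and-error nature of the shortcut can be sketched in a few lines of Python. This is only an illustration under the assumptions that \(a = 1\) and that \(b\) and \(c\) are integers; the function name `shortcut_roots` is made up for this post.

```python
def shortcut_roots(b, c):
    """Search for integers e, g with e * g == c and e + g == b.

    Returns the roots (-e, -g) of x^2 + b*x + c, or None when no
    integer pair exists (the case where the shortcut does not apply).
    """
    if c == 0:
        # x^2 + b*x = x * (x + b), so the pair is e = 0, g = b.
        return (0, -b)
    # Every integer factor of c lies between -|c| and |c|.
    for e in range(-abs(c), abs(c) + 1):
        if e != 0 and c % e == 0:
            g = c // e
            if e + g == b:
                return (-e, -g)
    return None
```

For the two examples above, `shortcut_roots(5, 4)` finds the pair 1 and 4, and `shortcut_roots(4, 4)` finds 2 and 2. For something like \(x^2 + x + 1\) the search comes back empty, which is where the shortcut runs out.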
However, this works poorly or not at all when \(c\) doesn’t have integer factors that sum to \(b\), or when \(a\) is not 1. It can be modified, but at some point it becomes easier to use the general solution, the Quadratic Formula: \[\text{Given } ax^2 + bx + c = 0\text{, then} \\ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \]
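As a minimal sketch, the Quadratic Formula translates almost directly into Python. The function name `quadratic_roots` is my own, and the sketch assumes real coefficients and a non-negative discriminant, since this post only deals with real roots.

```python
import math

def quadratic_roots(a, b, c):
    """Return both roots of a*x^2 + b*x + c = 0 via the Quadratic Formula."""
    if a == 0:
        raise ValueError("a must be nonzero for a quadratic equation")
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        # Assumption of this sketch: only real roots are handled.
        raise ValueError("negative discriminant: no real roots")
    root = math.sqrt(discriminant)
    # The two cases of the plus-or-minus sign.
    return ((-b + root) / (2 * a), (-b - root) / (2 * a))
```

Calling `quadratic_roots(1, 5, 4)` and `quadratic_roots(1, 4, 4)` gives the same roots as the factoring examples above, as floating-point values.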
I believe mathematics teachers tend to delay the Quadratic Formula because it looks complicated. The “shortcut” method involves trial and error and only works with a very limited subset of quadratics, but mathematics books are filled with deliberately constructed quadratic equations where it does work. So fair enough.
However, once the Quadratic Formula is introduced, it may be confusing on another level: It’s not immediately clear why the relatively complex-looking Quadratic Formula and the simple “find two numbers that multiply to \(c\) and add to \(b\)” method should return the same values. It’s an excellent opportunity for a teacher to illustrate that two different methods can return the same result. In this post, I’d like to show why they agree.
Rewriting the Quadratic Formula
Naturally, we could show how the Quadratic Formula was originally derived, but we don’t actually need to go that route. Instead, let’s think about what the formula means. The purpose of the formula is to find two binomial factors of a quadratic expression. Each of these factors will involve \(x\) multiplied by some value, with the product added to some other value. In mathematical terms, \[ ax^2 + bx + c = (dx + e)(fx + g) \]
We’re going to rewrite the Quadratic Formula in terms of these new parameters: \(d\), \(e\), \(f\), and \(g\).
First, expand the right side: \[(dx + e)(fx + g) = dxfx + dxg + efx + eg \\ = dfx^2 + (dg + ef)x + eg\]
We can then match each of these up with the coefficient parameters from the general quadratic expression: \[a = df \\ b = dg + ef \\ c = eg\]
Let’s take the original Quadratic Formula and use these new substitutions: \[x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \\ = \frac{-(dg + ef) \pm \sqrt{(dg + ef)^2 - 4dfeg}}{2df} | \text{ Substitute parameters} \\ = \frac{-(dg + ef) \pm \sqrt{(d^2g^2 + 2dgef + e^2f^2) - 4dfeg}}{2df} | \text{ Expand} \\ = \frac{-(dg + ef) \pm \sqrt{d^2g^2 - 2dgef + e^2f^2}}{2df} | \text{ Combine} \\ = \frac{-(dg + ef) \pm \sqrt{(dg - ef)^2}}{2df} | \text{ Contract} \\ = \frac{-(dg + ef) \pm (dg - ef)}{2df} | \text{ Simplify} \] Strictly speaking, \(\sqrt{(dg - ef)^2} = |dg - ef|\), but since the \(\pm\) already covers both signs, the absolute value can be dropped.
At this point, we want to look at the two cases represented by the \(\pm\) sign: \(x = \frac{-(dg + ef) + (dg - ef)}{2df}\) and \(x = \frac{-(dg + ef) - (dg - ef)}{2df}\). Take the first one: \[x = \frac{-(dg + ef) + (dg - ef)}{2df} \\ = \frac{- dg - ef + dg - ef}{2df} \\ = \frac{-2ef}{2df} = -\frac{e}{d} \] The second one returns a similar result: \[x = \frac{-(dg + ef) - (dg - ef)}{2df} \\ = \frac{- dg - ef - dg + ef}{2df} \\ = \frac{-2dg}{2df} = -\frac{g}{f} \]
So the Quadratic Formula gives us two roots: \(x = -\frac{e}{d}\) and \(x = -\frac{g}{f}\). This is to say, either \(x + \frac{e}{d} = 0\) or \(x + \frac{g}{f} = 0\). These should look familiar, because if \(x + \frac{e}{d} = 0\) then \(dx + e = 0\), which is one of the two factors at the beginning of this section.
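The algebra can be spot-checked numerically. The sketch below is only a sanity check, not a proof: it assumes integer values for \(d\), \(e\), \(f\), and \(g\) with \(d\) and \(f\) nonzero, and the function names are made up for this post. It uses exact fractions so that no rounding gets in the way.

```python
from fractions import Fraction
from math import isqrt

def roots_from_factors(d, e, f, g):
    """Apply the Quadratic Formula to the expansion of (d*x + e)(f*x + g)."""
    a, b, c = d * f, d * g + e * f, e * g
    discriminant = b * b - 4 * a * c   # algebraically (d*g - e*f)**2
    s = isqrt(discriminant)            # exact integer square root
    assert s * s == discriminant       # confirms it is a perfect square
    return {Fraction(-b + s, 2 * a), Fraction(-b - s, 2 * a)}

def matches_factor_roots(d, e, f, g):
    """True if the formula's roots are exactly -e/d and -g/f."""
    return roots_from_factors(d, e, f, g) == {Fraction(-e, d), Fraction(-g, f)}
```

For example, `matches_factor_roots(2, 1, 1, 3)` checks \((2x + 1)(x + 3) = 2x^2 + 7x + 3\), whose roots are \(-\frac{1}{2}\) and \(-3\). The roots are compared as a set because which sign of \(\pm\) yields which root depends on the sign of \(dg - ef\).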
The Quadratic Formula and the special case shortcut
So now that we’ve written the roots of a quadratic expression in terms of different parameters, how does this align with the special case?
The shortcut works best when \(a = 1\). Since \(a = df\), \(d\) is the multiplicative inverse of \(f\); for the sake of simplicity and illustration, let us use \(d = f = 1\). This leads to \(b = g + e\) and \(c = eg\), with the roots \(x = -e\) and \(x = -g\)… which is the shortcut, precisely.
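As a quick worked check, take the first example from the Background section, \(x^2 + 5x + 4\), and read it as the special case with \(d = f = 1\), \(e = 4\), and \(g = 1\) (this assignment of \(e\) and \(g\) is just one choice; swapping them gives the same roots): \[b = g + e = 1 + 4 = 5 \\ c = eg = 4 \cdot 1 = 4 \\ x = \frac{-5 \pm \sqrt{5^2 - 4 \cdot 1 \cdot 4}}{2 \cdot 1} = \frac{-5 \pm 3}{2} \] The two cases give \(x = -1 = -g\) and \(x = -4 = -e\), matching the factoring \((x + 4)(x + 1)\).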