Math: Estimating Roots

A few years ago, I developed a quick algorithm for approximating square roots. I’ve since come across a more effective, albeit slightly more complicated, algorithm. Both of these are meant to create quick approximations in an era where calculators can come up with more precise values far quicker than humans.

My first method

If you’re trying to take the square root of a radicand that is not a perfect square, the value will be an irrational number between two integers. These two integers correspond to the closest perfect squares in either direction, and the irrational values that are the square roots between those integers are distributed between them. For instance, here are the square roots of all integers between 16 and 25:

Casually, these appear to be evenly distributed, but they’re slightly skewed to the right. Nonetheless, we can use this apparently even distribution to estimate the values: There are nine gaps between the integers, so approximate each value based on dividing its corresponding distance from 16 by 9. For instance, $\sqrt{18}\approx4\frac29$ .

Here is a comparison of the approximations using this method to the actual values:

The underlying algorithm is:

Find the two closest perfect squares to the target integer radicand $r$ . Call these $m$ and $n$ .
Let $M=\sqrt{m}$ .
The approximate root is $M+\frac{r-m}{n-m}$ .

For example, estimate $\sqrt{125}$ :

$m=121$ ; $n=144$
$M=11$
$\sqrt{125}\approx11\frac{125-121}{144-121}=11\frac{4}{23}$

The calculator value is $\sqrt{125}\approx11.1803398875$ , compared to the algorithm value of $11.1739130435$ . A decent estimate given the minimal effort.

A drawback of this method is that it only works decently for square roots. Here is the distribution of cube roots from 8 to 27:

The rightward skew is much more obvious, and with higher powers it gets even more so.

A More Refined Approach

This explanation will involve calculus, but first, here’s the algorithm. To approximate $\sqrt[n]{r}$ :

Find the closest perfect power to the target integer radicand $r$ . Call this $m$ . (Note that $m$ can be greater or less than $r$ .)
Let $M=\sqrt[n]{m}$ .
The approximate root is $\sqrt[n]{r}\approx M+\frac{r-m}{n\cdot M^{n-1}}$ .

For example, estimate $\sqrt{125}$ :

$m=121$ (because 125 is closer to 121 than to 144)
$M=11$
$\sqrt{121}\approx 11+\frac{125-121}{2\cdot11^1}=11+\frac{4}{22}=11\frac{2}{11}$

This gives a calculator value of $11.1818181818$ , a better approximation.

This method particularly shines for root of higher powers. For example, estimate $\sqrt[3]{135}$ :

$m=125$ (because 135 is closer to 125 than to 216)
$M=5$
$\sqrt[3]{135}\approx 5+\frac{135-125}{3\cdot5^2}=5+\frac{10}{75}=5\frac{2}{15}$

The calculator value is $\sqrt[3]{135}\approx5.12992784003$ , compared to the algorithm value of $5.133333333333$ .

Here’s a case where we chose a higher integer. Estimate $\sqrt[3]{175}$ :

$m=216$ (because 175 is closer to 216 than to 125)
$M=6$
$\sqrt[3]{175}\approx 6+\frac{175-216}{3\cdot6^2}=6-\frac{41}{108}=5\frac{67}{108}$

The calculator value is $\sqrt[3]{175}\approx5.59344471041$ , compared to the algorithm value of $5.62037037037$ .

So why does this work?

What we’re doing is using a linear approximation based on the line tangent to the inverse function at the nearest integer.

Oof.

In clearer terms: For positive radicands, $\sqrt[n]{x}$ is the inverse of $x^n$ . The steepness of the graph of $x^n$ at any point is equal to $n\cdot x^{n-1}$ . This means that if two values, $m$ and $r$ , are close to each other, and we know the value of $m^n$ , we can approximate the value of $r^n$ by using a line with a slope of $n\cdot m^{n-1}$ that contains the point $(m, m^n)$ .

Likewise, we can do the opposite. If we happen to know the value of $r^n$ and want to approximate the value of $r$ , we can use that same line to do so. We could calculate the distance from $m^n$ to $r^n$ , divide that distance by the slope at $m$ , and add that to the starting value ( $m$ ).

In calculus terms, our approximation, where $f(x)=x^n$ and $R=\sqrt[n]{r}$ , is: $R\approx M+\frac{f(R)-f(M)}{f'(M)}$

Which algorithm to use?

Honestly, if the point is to get a precise value, use a calculator. I generally advise my students that the basic purpose of these algorithms is to be able to build a number sense and to have a decent approximation when a calculator isn’t at hand or if you want to double check work.

For square roots, I still think my original algorithm is more straightforward, especially as to “why it works”. For higher roots, the second algorithm is definitely better.

As for error, the first algorithm consistently underestimates the value, something else that’s straightforward to explain because of the rightward skewing on the number line.

The more advanced algorithm’s error is consistently slightly too high. Because root graphs (for positive integers indices above 1) are increasing and concave downward, all tangent lines will be above the graph except at the point of intersection.

If nothing else, these explorations allow us to see how numbers interrelate. Mathematics shouldn’t just be about number-crunching.

Addendum: Technically…

The second algorithm chooses the closest integer root, but close to the halfway point, the “better” choice is the higher integer. For instance, for the integers in (64, 125), estimating the cube root of each value, the algorithm calls for using $m=64$ for $r\le94$ and $m=125$ for $r\gt94$ , but $m=125$ is the better choice for $r\ge89$ . However, the method for calculating this tipping point involves a bunch of algebra that, while not difficult, is time-consuming and not worth the effort for the minimal gain in precision, especially since this is about approximating.

In this case, using $M=4$ or $M=5$ gives us two candidates: $4+\frac{x-64}{3\cdot4^2}$ and $5+\frac{x-125}{3\cdot5^2}$ . Setting these equal to each other and solving for $x$ yields $x=88\frac89$ as where those two tangent lines intersect. To the left of that point, $M=4$ provides a better estimate; otherwise, $M=5$ provides a better estimate. However, that’s a lot of work for a minimal yield. I’m providing it here to pre-empt pedantic criticism.

My first method

A More Refined Approach

So why does this work?

Which algorithm to use?

Addendum: Technically…

Share this:

Leave a Reply Cancel reply