I’ve been seeing a few videos lately explaining why \(0! = 1\). Over my time as a teacher, I’ve seen a variety of explanations for this equality, and they generally fall into these categories:
- It’s an arbitrary definition that we need to make math work.
- It comes from the combinatoric definition of factorial.
- It’s the result of the Gamma function.
What I don’t recall seeing is a complete explanation from the perspective of set theory, so I’ll provide that in this article.
It’s natural that a combinatoric explanation would be popular: That’s how the factorial was initially explored in detail. Specifically, the gist is this: \(n!\) represents the number of ways in which we can arrange \(n\) objects. If we have three objects, A, B, and C, we can arrange them six ways: ABC, ACB, BAC, BCA, CAB, CBA. \(3!=6\), so there we go.
The conclusion of this argument is that, if we have no objects, there’s only one way to arrange them, so \(0!=1\). The problem with this argument is that it then gets bogged down in philosophy: If we have no objects, we have nothing to arrange at all, so we’re stuck. It’s nonsensical to say there’s one way to do something we can’t do in the first place.
On one level, I understand the argument that there’s one way to arrange nothing. On another, though, I understand why people balk at that entire notion.
Another problem with this “How many ways can you do something you can’t do?” framing is that it doesn’t cleanly explain why \((-1)!\) is undefined. I can’t arrange negative one items, either.
What each of the categories I listed above has in common is that it is specific to the factorial. But there’s another method of explaining this that relies on more general concepts. Most of the time in mathematics, an explanation that relies on general concepts is better than a context-specific one.
Identities and Set Theory
Let’s define some concepts.
First, an operational identity is a value that, when used with an operator, doesn’t change the total result. The two primary identities that most people learn about (although maybe not with this language) are the additive identity (0) and the multiplicative identity (1). For instance, \(4 + 0 = 4\) and \(4 \times 1 = 4\): Adding (or subtracting) zero doesn’t change the value; likewise for multiplying (or dividing) one.
Next, a collection of values is any grouping of individual values. For instance, \(A = (3, 4, 7, 9)\) might be the ages of four children in a household; \(B = (5, 5, 9, 12)\) might be another such collection. We could then ask, “What is the total of the ages of all the children in these households?” which would give us 23 for \(A\) and 31 for \(B\).
I’m using the word “collection” here because a set is a collection with no repeated values. But the operations I discuss below on collections also apply to sets.
For the purpose of this discussion, let’s define ⊕ as the sum of all the values in a collection and ⊗ as the product (this is not the typical definition of these symbols). In the examples above, \(⊕A = 23, ⊕B = 31, ⊗A = 756, ⊗B = 2700\).
What if there’s nothing in the collection? Perhaps there are no children living in household \(C\).
Since we’re applying an operator to all the elements of a collection, it makes sense that we would use the corresponding identity as our starting value. After all, if a child moves out of household \(B\), we would subtract their age from \(⊕B\) and divide \(⊗B\) by their age.
Let’s say that \(C\) had a single child, who was 12. In this case, \(⊕C=⊗C=12\). They move out, and now \(⊕C = 12-12 = 0\) and \(⊗C = 12/12 = 1\).
An empty collection is called the null set, typically depicted as \(\emptyset\). So \(⊕\emptyset = 0\), the additive identity, and \(⊗\emptyset = 1\), the multiplicative identity. This is often said to be “by convention”, but as I’ve just shown, this is not an arbitrary decision; it follows from the nature of these collection operators.
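As a sketch of these collection operators (Python is my own choice of illustration here; the symbols ⊕ and ⊗ are the article's, not Python's), the built-ins `sum` and `math.prod` already behave exactly this way: each starts from its operator’s identity, so applying them to an empty collection returns that identity.

```python
from math import prod

A = (3, 4, 7, 9)   # ages of the children in household A
B = (5, 5, 9, 12)  # ages of the children in household B
C = ()             # household C has no children

# ⊕ is the sum of a collection; ⊗ is the product.
print(sum(A), prod(A))  # 23 756
print(sum(B), prod(B))  # 31 2700

# The empty collection yields each operator's identity.
print(sum(C))   # 0, the additive identity
print(prod(C))  # 1, the multiplicative identity
```

Notice that `prod(())` returning 1 isn’t a special case bolted onto the library; it falls out of starting the running product at the multiplicative identity.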
Importantly, I have not mentioned factorials in this section.
The Set Theory Definition of Factorials
Let \(N\) be the set of the first \(n\) positive integers.
Then \(n!=⊗N\).
That’s it. That’s the whole definition, and it accounts for both \(0! = 1\) (when \(n=0, ⊗N=⊗\emptyset=1\)) and why \(n!\) is undefined for negative values (you can have an empty set, but you can’t have a set with a negative number of elements).
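This definition translates almost verbatim into code. A minimal sketch in Python (my own illustration, not part of the original argument): take the product over the first \(n\) positive integers, and when \(n = 0\) the range is empty, so the product is the multiplicative identity.

```python
from math import prod

def factorial(n: int) -> int:
    """n! as the product of the set {1, 2, ..., n}."""
    if n < 0:
        # You can have an empty set, but not a set with negative size.
        raise ValueError("factorial is undefined for negative integers")
    # range(1, 1) is empty when n == 0, so prod returns 1.
    return prod(range(1, n + 1))

print(factorial(0))  # 1
print(factorial(3))  # 6
print(factorial(5))  # 120
```

Both special behaviors (\(0! = 1\) and the error for negative inputs) come straight from the set-based definition rather than from ad hoc cases.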
This explanation is contained in one of the common algorithmic explanations: If we know \(n!\), we can find \((n-1)!\) by dividing by \(n\): \(5! = 6!/6 = 120\), etc., so \(0! = 1!/1 = 1\) but \((-1)! = 0!/0\), which is undefined.
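That divide-down argument can be sketched numerically (again a Python illustration of my own): start from a known factorial and divide by \(n\) at each step, and the descent necessarily stops at \(0!\), because the next step would divide by zero.

```python
fact = {6: 720}  # start from a known value, 6! = 720

for n in range(6, 0, -1):
    fact[n - 1] = fact[n] // n  # (n-1)! = n!/n

print(fact[0])  # 1
# One more step would be (-1)! = 0!/0: division by zero, hence undefined.
```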
But that explanation makes me slightly uneasy because it defines factorial going up, then argues for \(0!\) going down. By the time we know what \(6!\) is, we’ve already established \(5!\). It’s not a full-fledged act of begging the question, but it’s in that general area.
I greatly prefer seeing the factorial as the product of consecutive positive integers, and this definition relies on concepts that we can establish (and hence use) independently of the factorial itself.