An interesting question on the G+ Mathematics community I co-moderate asked about the difference between “numbers” and “numerals”. We wound up discussing this at a party I was hosting (which shows the sort of nerds we are), and this post is born from those discussions and my further thoughts.
It seems to me we have at least four overlapping but distinct terms when it comes to matters of numbers: Number, value, numeral, and digit. I’ll deal with each of these in turn.
The one I feel most confident in defining is “number”. A number is any predictable symbolic representation of a specific value. This means, of course, coming back to what “value” means, but let’s look at the rest first.
My definition makes “number” a quasilinguistic unit rather than a mathematical one. That is to say, a number is an effective way to communicate a value from one person to another, just as a word is an effective way to communicate an idea.
Here, then, are some numbers: 3, 4.512, four thousand and five, sechs und siebzig, soixante-dix, 四, π, XIV, (4·5)+2, 8A4, 5!
Note that some of these examples require some knowledge beyond simply the ability to map a symbol onto a value. There are also some ambiguities. For instance, we generally assume that numbers are written in base 10 unless explicitly told otherwise. There are computer programming documents, however, where the default base for the scope of the article is binary, hexadecimal, or (I presume) octal: Someone who’s not aware of this may either misinterpret values (1001 as “one thousand and one” instead of as “nine”) or be completely confused (“what does 8A mean?”).
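The base ambiguity is easy to demonstrate. Here is a minimal sketch in Python, using the built-in int() with an explicit base; the variable names are mine, chosen only for illustration:

```python
# The same string of characters names different values
# depending on which base the reader assumes.
text = "1001"
as_decimal = int(text, 10)  # read as base 10
as_binary = int(text, 2)    # read as base 2

print(as_decimal)  # 1001
print(as_binary)   # 9

# "8A" only makes sense once we know the base is at least 11;
# here it is read as hexadecimal.
as_hex = int("8A", 16)
print(as_hex)  # 138

# Read as base 10, "8A" is simply not a number.
try:
    int("8A", 10)
    readable_in_base_10 = True
except ValueError:
    readable_in_base_10 = False
print(readable_in_base_10)  # False
```

The symbols never change; only the reader's assumption about the system does.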
Also, people who are not familiar with specific languages won’t know how to translate number words in those languages; “sechs und siebzig” won’t be understandable to someone who doesn’t speak German, or at least a language close enough to it to make a reasoned guess.
Likewise, number symbols are only useful to people who know the systems. We tend to think of “3” as being language-independent, by which we mean it doesn’t matter what language we speak. While that’s true, it’s still important to know the underlying Arabic numeral symbol system; 四 is as much a number symbol as 3 is, but it’s only known to a comparatively small portion of the population. XIV is only meaningful to people who understand the Roman numeral system, and in fact, for much of the Roman era, it wouldn’t have made sense (the convention of writing 4 as IV instead of as IIII was moderately late).
The concept of value strikes me as fairly unambiguous as well, but it’s harder to explain. The analogy to natural language also fails us a bit. The word-concept relationship is based on cultural notions of semantic prototypes: We break the world’s objects down into clusters, then name those clusters. For instance, the English word “dog” and the German word “Hund” largely refer to the same set of objects in the real world, but there are differences. Plato might have argued that there’s some philosophically ideal “dog”, but I’m not so convinced.
While words are attached to mental concepts of real-world object groupings that are generally socially agreed on, numbers really are attached to ideal values of the sort associated with Platonic thought. The value of “4” (integer base 5 or higher) is unchanging. We can represent it as real-world objects: o o o o. That group of objects has “4-ness”. There is no unit attached to ideal values; another way to state this is that the unit of ideal values is the null unit. We can do mathematics in the abstract without units because we can move into this ideal realm; if we later reattach units, we must make sure to be consistent.
I find myself struggling the most with the distinction between “numeral” and “digit”. Continuing the analogy to natural language writing, it seems that numerals are the equivalent of letters. This seems straightforward for people used to working within the system most of us know, the Arabic numerals: Each character represents a certain value determined by multiplying its face value by the value of its place. For instance, in 35, the 3 indicates a value of thirty, while the 5 indicates a value of five.
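The face-value-times-place rule can be sketched in a few lines of Python. The function name place_values is my own, and the sketch assumes a base-10 whole number written with no separators:

```python
def place_values(number_text):
    """Return the value each character contributes in a base-10 whole number."""
    length = len(number_text)
    # The leftmost character sits in the highest place: 10 ** (length - 1).
    return [int(ch) * 10 ** (length - 1 - i) for i, ch in enumerate(number_text)]

print(place_values("35"))         # [30, 5]
print(sum(place_values("8330")))  # 8330
```

The same character "3" contributes thirty in 35 but three hundred in 8330; its meaning depends entirely on where it sits.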
However, this is not the only system used for writing numbers. In the Roman numeral system, each character represents the same value wherever it appears; there’s no place system because 10 (X) is a completely different symbol from 1 (I), and 100 (C) and 1000 (M) are unique symbols beyond that. Indeed, in the history of mathematics, the development of the place system was a major innovation because it greatly simplified basic mathematical calculations.
This does raise the question: What is the set of Roman numerals? Clearly it includes I, V, X, L, C, D, and M. When I was discussing this with friends, the sentiment appeared to be that other number values are also included: II, III, IV, and so on. I would argue that this is not the case, and that the seven basic letters are the complete set of Roman numerals. Compare letters: When we combine them, we only entertain the idea that the combination is a new letter when the parsing changes; “th” represents a different sound in English than either “t” or “h” independently. Even in those cases, though, we don’t in fact consider them new letters in English. Spanish language authorities also recently demoted that language’s digraphs (ch and ll, e.g.) to pairs of letters, so that the current “official” Spanish alphabet has 27 letters (English’s 26 plus ñ). The Roman number IV does not have a different meaning than I and V separately; quite the contrary, its meaning is determined from the meaning of the individual pieces.
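To illustrate that a Roman number's meaning comes entirely from its seven basic pieces, here is a rough Python sketch. The names ROMAN_VALUES and roman_to_int are mine, and the function assumes well-formed input; it makes no attempt to reject strings a Roman would never have written:

```python
# The seven basic Roman numerals and their fixed values.
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def roman_to_int(text):
    """Compute a value from the individual numerals alone (no validation)."""
    total = 0
    for i, ch in enumerate(text):
        value = ROMAN_VALUES[ch]
        # Subtractive convention: a smaller numeral directly before
        # a larger one is subtracted rather than added.
        if i + 1 < len(text) and value < ROMAN_VALUES[text[i + 1]]:
            total -= value
        else:
            total += value
    return total

print(roman_to_int("XIV"))      # 14
print(roman_to_int("IIII"))     # 4
print(roman_to_int("CLXXIII"))  # 173
```

Nothing beyond the seven-entry table is needed: IV is handled by a rule about neighbors, not by treating IV as an eighth symbol.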
The number system used by the Chinese (and hence the Japanese and the Koreans, in some contexts) uses characteristics of both of these systems: There is a set of numerals for 0 through 9, and there are placeholder symbols for powers of 10. This means that the symbol for zero is only used for the value of zero, because it’s not otherwise needed: 50,402 is written as 五萬四百二, that is, “five ten-thousands, four hundreds, and two”. An advantage of the Chinese system (and the Roman system) can be seen in writing round multiples of large powers of 10 quickly: 5,000,000 is 五百萬. A disadvantage is in writing any large number that is not such a round multiple: 123,456,789 is 一億二千三百四十五萬六千七百八十九.
I would argue that the set of Chinese numerals consists of the value symbols (零〇一二三四五六七八九)* combined with the place holder symbols (十百千萬億兆).
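As a rough illustration of how value symbols and place holder symbols combine, here is a Python sketch that reads such a number back into a value. All names are my own. It assumes 兆 means 10^12 (its value varies by region), treats a bare place holder such as 百 as "one hundred", and does not handle stacked large place holders:

```python
DIGITS = {"零": 0, "〇": 0, "一": 1, "二": 2, "三": 3, "四": 4,
          "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
SMALL_PLACES = {"十": 10, "百": 100, "千": 1000}
# Assumption: 兆 is taken as 10**12; regional usage differs.
LARGE_PLACES = {"萬": 10**4, "億": 10**8, "兆": 10**12}

def chinese_to_int(text):
    """Read value-place pairs; a sketch without validation."""
    total = 0       # sum of completed 萬/億/兆 groups
    section = 0     # value built up within the current group
    pending = None  # value symbol still waiting for its place holder
    for ch in text:
        if ch in DIGITS:
            pending = DIGITS[ch]
        elif ch in SMALL_PLACES:
            # A bare place holder counts as one of that place.
            section += (1 if pending is None else pending) * SMALL_PLACES[ch]
            pending = None
        elif ch in LARGE_PLACES:
            section += pending or 0
            total += (section or 1) * LARGE_PLACES[ch]
            section, pending = 0, None
        else:
            raise ValueError(f"not a numeral: {ch!r}")
    return total + section + (pending or 0)

print(chinese_to_int("五萬四百二"))  # 50402
print(chinese_to_int("五百萬"))      # 5000000
print(chinese_to_int("二百三十四"))  # 234
```

Each value symbol is multiplied by whatever place holder follows it, which is why no zero is needed to hold empty places.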
This means that the Chinese, Roman, and Arabic numerals represent different things from each other (Arabic 2 indicates a value of 2 · 10^n, where n is determined by how many other numerals come after [or before, in the case of decimals], Roman II consists of two I numerals, each meaning a single unit, and Chinese 二 indicates a value of 2 · 10^n, where n is whatever place holder characters are provided [if any]). This isn’t particularly problematic, though, if we think of the variety of what written characters can mean with regard to natural language; letters in alphabetic scripts can represent single sounds, they can combine with neighbors to create other sounds (as in <th>), they can have accent marks to augment their pronunciations (ç), and so on. And of course not all written scripts are alphabetic; in Chinese writing, each character represents a syllable or a semantic kernel, while in Korean, phonemes are clustered into groupings that represent syllables.
* 零 is the traditional Chinese character, while 〇 is used in Korea and Japan.
The concept I find myself struggling with the most is that of “digit”. Are digits and numerals synonyms? If not, what’s the difference?
Let’s start with numbers expressed in Arabic numerals. 8,330 is normally called a 4-digit number, and it consists of four numerals. That would suggest that “digit” and “numeral” are effectively synonyms. I feel, though, that “digit” tends to refer to the symbols in situ, while “numeral” refers to the characters themselves.
Now let’s consider Roman depictions. The number CLXXIII uses seven numerals, including four different ones. How many digits is it? The etymology of “digit” is that of a finger (or toe), and so it’s defensible to argue that a digit is a representation of a specific place value. CLXXIII expressed in Arabic numerals is three digits: 173. If we were going to communicate that using our hands, we’d most likely use three gestures, made with our digits (fingers): 1, then 7, then 3.
Furthermore, despite their use of what now seems to us to be a clumsy mathematical symbol system, the Romans did have a concept of place value: They invented the abacus, although the Chinese added rods (instead of troughs) to make it much more efficient to use. So it seems to me to be somewhat dismissive to refer to CLXXIII as having seven digits when the Romans understood that it represented three values.
With regard to Chinese, except for values from 0 to 9 and powers of 10, numbers are expressed in terms of value-place pairs. We would call 100 a three-digit number; in Chinese (as in the Roman system), it’s a single character (百). We would call 234 a three-digit number; Chinese uses five characters, 二百三十四 (while Medieval Roman uses seven: CCXXXIV).
The key question: Does digit refer to the number of place values, or the number of numeral characters used?
In our system, those are the same counts. In Roman and Chinese, though, they’re not. I’m inclined to think, based on the aforementioned etymology of “digit” as well as the knowledge in both Rome and China of the abacus (and hence an understanding of place values), that “digit” refers to the number of place values, regardless of the number of characters used in its depiction. Hence, 百 (C) and 二百三十四 (CCXXXIV) are both three-digit numbers.
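The two candidate counts are easy to put side by side. In this small Python sketch, place_value_count is a name I made up; it assumes a positive whole number and counts base-10 place values, while len() counts the characters in each written form:

```python
def place_value_count(value):
    """Number of base-10 place values in a positive whole number."""
    return len(str(value))

# Written forms of the same two values in three systems.
representations = {
    100: ["100", "C", "百"],
    234: ["234", "CCXXXIV", "二百三十四"],
}

for value, forms in representations.items():
    print(value, "has", place_value_count(value), "place values")
    for form in forms:
        print("   ", form, "uses", len(form), "characters")
```

Only in the Arabic forms do the two counts agree; under the place-value reading, both values are three-digit numbers however many characters they take to write.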
I’m curious what others think. Please do comment.