Nutters.org The Nutter Log
Proverbial Probability Problems
Entry id: profound-combinations
By The Famous Brett Watson
On Fri, 24 Jan 2003 05:03:00 +1100

Greetings, folks! Today, I'd like to share with you a little about epistemology and probability (mostly the latter, but apropos of the former). I originally started writing this item about six months ago upon receiving a curious rant via email from a party using the nom de plume "Ryan and Jacob". At first, I couldn't figure out whether the thing was spam or not, since it was not selling any of the usual dubious merchandise endemic to spam; rather, the content was very much epistemological, with passing reference to probability. If this was spam, they'd managed to target it much better than any of the twits trying to sell me Viagra pills. A quick web-search for "Ryan and Jacob" will affirm that the item in question was spammed, and I eventually filed it under "Mail Abuse", rather than "Nutters".

It seems that Ryan and Jacob had just discovered Epistemology, badly. They'd gone and recognised that so much of what we consider "knowledge" can't really be proved as such. Sadly, they then assumed that they were the only people in the world awake to this fact (and thus somewhat enlightened relative to the rest of us) and embarked on a quest to find like-minded individuals who would affirm that which they'd already decided was true. Hopefully, they'll grow out of it, although I have reason to believe these guys are now in their early-to-mid twenties, and I feel one should have already gone through that phase by that stage, if one is going to go through it at all.

But on to probability. In a pre-emptive strike against those who would say, "I've heard this all before", Ryan and Jacob's rant contained the following little snippet.

I don't really care if anything I say has been said before, if it was portrayed in movies, in books, or in the lyrics of some useless song. With 6 billion people covering the globe at any given time, thousands and thousands of years of written literature, probability dictates almost any combination of words has occurred numerous times.

"Probability dictates", you say? What a nice, juicy, measurable claim. Let us engage in a little mathematics to examine it.

To assist us in this measurement, I will engage the services of a silly little computer program that I wrote some years ago called "profound". It generated random proverbs of the form "X is the Y of the Z", where X, Y, and Z were chosen from three fixed sets of nouns or adjective-noun pairs, containing 21, 16, and 22 elements respectively. Thus the program was capable of producing 21*16*22, or 7392, proverbs. For example, one such proverb is, "meticulousness is the hallmark of the resourceful." Some of the combinations were silly, some of them were a bit odd, some of them were downright amusing, and some were curiously profound. If nothing else, their simple structure meant that they were all grammatically correct, even if they stated something ridiculous.
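For the curious, here is a rough sketch in Python of how a program like "profound" might work. This is not the original program: only the list sizes (21, 16, and 22) and the one example proverb come from the description above, and the names `XS`, `YS`, `ZS`, `random_proverb`, and `proverb_count`, along with all the placeholder words, are made up for illustration.

```python
import random

# Only the sizes (21, 16, 22) and one example proverb come from the text;
# every other entry is a hypothetical placeholder.
XS = ["meticulousness"] + [f"x{i}" for i in range(2, 22)]  # 21 entries
YS = ["hallmark"] + [f"y{i}" for i in range(2, 17)]        # 16 entries
ZS = ["resourceful"] + [f"z{i}" for i in range(2, 23)]     # 22 entries


def random_proverb(rng=random):
    """Pick one X, one Y and one Z at random and fill in the template."""
    return f"{rng.choice(XS)} is the {rng.choice(YS)} of the {rng.choice(ZS)}."


def proverb_count():
    """Each slot is chosen independently, so the sizes simply multiply."""
    return len(XS) * len(YS) * len(ZS)


print(proverb_count())   # 7392
print(random_proverb())  # one of the 7392 possible proverbs
```

Because the three choices are independent, the total is just the product of the list sizes, which is where the 7392 comes from.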

Let's imagine that I take these 7392 proverbs and use them to create a collection of essays, where each essay consists of ten proverbs. My collection of essays is going to include every possible combination of ten proverbs, whilst observing two special rules: I'm not going to use the same proverb more than once in any essay, and I'm not going to allow any essay to be a simple re-ordering of any other essay. All the essays are therefore unique, in more ways than one.
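To see what those two rules amount to, here is a small sketch in Python using a toy library of five stand-in proverbs and essays of three proverbs each. The proverb labels and the essay length are invented for the illustration; `itertools.combinations` and `math.comb` are from the Python standard library.

```python
from itertools import combinations
from math import comb

# Hypothetical stand-ins for proverbs; the real set has 7392 of them.
proverbs = ["A", "B", "C", "D", "E"]
essay_length = 3  # the essays described above use 10

# combinations() never repeats an item within a selection, and it yields
# each selection in one order only, which matches the two rules.
essays = list(combinations(proverbs, essay_length))

for essay in essays:
    print(" ".join(essay))

print(len(essays))                         # 10
print(comb(len(proverbs), essay_length))   # 10, the same count by formula
```

Enumerating the essays like this only works for toy sizes. For 7392 proverbs and ten per essay, the count has to come from the formula rather than from listing them.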

This turns out to be one of those monkey-math situations which produces a ridiculously large number. The mathematical operation used to compute the number of essays, given the above description, is called "combination": the number of ways of choosing 10 items from 7392 when order doesn't matter and nothing is repeated. To cut a long explanation short, the number of essays in my imaginary library would be 133,417,124,980,308,800,794,439,887,058,448. That's rather a lot. Six billion people covering the globe for thousands and thousands of years wouldn't be able to write an appreciable fraction of these. Give them a billion years and the ability to produce one essay per second each, and they'd still only produce about 0.000142% of the total library (a fraction of roughly 0.00000142).
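If you'd like to check the arithmetic yourself, the following Python sketch does the sums. It assumes a 365-day year and that every one of the six billion people writes one essay per second, with no two people ever writing the same essay; the variable names are mine. Under those assumptions the fraction written works out to about 0.00000142, which is about 0.000142 when expressed as a percentage.

```python
from math import comb

PROVERBS = 7392
ESSAY_LENGTH = 10

# Number of ways to choose 10 distinct proverbs when order is ignored.
library_size = comb(PROVERBS, ESSAY_LENGTH)
print(f"{library_size:,}")

PEOPLE = 6_000_000_000
YEARS = 1_000_000_000
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: 365-day year

# Assumption: one essay per person per second, and no duplicated effort.
essays_written = PEOPLE * YEARS * SECONDS_PER_YEAR

fraction = essays_written / library_size
print(f"fraction written: {fraction:.3g}")         # about 1.42e-06
print(f"as a percentage:  {fraction * 100:.3g}%")  # about 0.000142%
```

Python's integers are arbitrary precision, so the library size is computed exactly; only the final division is done in floating point.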

At this point, you could argue that it's still only 7392 proverbs, no matter how many ways I paste them together. That's true: in my scenario, each individual proverb occurs an enormous number of times, but each essay, viewed as a combination of words, is unique. The general truism that we can glean from this is that any sufficiently small combination of words is likely to have occurred numerous times, and any sufficiently large combination of words is not likely to occur at all. In this text, for example, it's highly unlikely that any adjacent pair of words is unique in all the world, but many of the sentences and most of the paragraphs will be.

Yes, folks, there's still room in the universe for original authors! Ryan and Jacob's letter, for example, is almost certainly a unique work, when viewed as a whole. Originality doesn't equate to quality, however, and their grasp of epistemology is as poor as their grasp of probability.

Public Domain: the author waives copyright on this log entry. Other sources (if any) are quoted with permission or on the principle of "fair dealing" and retain their original copyrights.