Nutters.org The Nutter Log
Monkeys and the Lottery
Entry id: monkey-lottery
By The Famous Brett Watson
On Fri, 21 Jun 2002 02:49:00 +1000

Hello folks. One way or another, my heavy university commitments are nearly over. Thus, I hope to be better placed to write more nutty stuff in the near future. Meanwhile, here's another of those "I had it lying around anyhow" type of log entries.

Recently I was sent an email message thanking me for The Mathematics of Monkeys and Shakespeare, and asking me to adjudicate on a matter involving probability. The matter on which I was called to judge was described as follows.

One of my colleagues is a programmer, and he is arguing with the rest of us that if he generates a random string of 32 characters, it can never be reproduced anywhere in the world. What I think is that the chances are never null; there is always a chance. What do you think?

My correspondent is here making a distinction between "highly improbable" and "absolutely impossible". This is a fair distinction to make, but it's also a trap for the unwary. Just because something is not absolutely impossible doesn't mean it will happen in this finite universe of ours. Cryptographers rely on this very fact when they encrypt data with a 128-bit key, for example. It is not absolutely impossible to decrypt something that has been encrypted with a 128-bit key — that would defeat the whole purpose of encrypting in the first place — but if the best available way to decrypt the data without knowledge of the key is to make guesses at the key, then a cryptographer will consider the system unbreakable.

But if people in general understood all that cryptography mumbo-jumbo, there'd be no need for me to talk about monkeys, typewriters, and millions or billions of years. Even with all the monkeys and typewriters story-telling, the point — that sufficiently improbable things are effectively impossible to produce without intelligent input — is still often missed. I need another analogy — one that people are going to relate to on a much more intuitive level.

With those considerations in mind, I wrote the following response.

Thanks for your comments. In short, I agree with your programmer colleague, but we must make a distinction between "can never" and "will never, with a high degree of certainty". Let's do some maths.

Let's assume that the character set we're dealing with has 64 characters. That's enough to get upper and lower case English letters, the digits from 0 to 9, and two other characters. Slightly limited relative to what I'm typing here, but similar. How many different strings containing 32 characters are there? You get the answer by multiplying 64 by 64 by 64, etc, 32 times.

  64 * 64 * 64 * ... * 64 (32 times)
= 64^32 (64 to-the-power-of 32)
= 6277101735386680763835789423207666416102355444464034512896

That number is 58 digits long. That's a big number. BUT, if I could generate ALL of those strings, then I could GUARANTEE you that one of my strings is the same as his. The trick is, of course, that I can't generate all those strings, because there isn't enough time and matter in the universe to do the task. Instead, I'm going to have to generate SOME of the strings, and hope for the best.
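
For anyone who would rather check the count of possible strings than take it on trust, here is a minimal Python sketch. It assumes exactly the figures used above (a 64-character set and 32-character strings); the variable names are mine, chosen for illustration.

```python
# Assumptions taken from the text: 64 possible characters, strings of length 32.
ALPHABET_SIZE = 64
STRING_LENGTH = 32

# Python integers have arbitrary precision, so this is the exact count.
total_strings = ALPHABET_SIZE ** STRING_LENGTH

# Number of decimal digits in that count.
digit_count = len(str(total_strings))

print(total_strings)
print(digit_count)
```

Since 64 is 2 to the power of 6, the same count can also be written as 2 to the power of 192.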

If I can generate 1% of the strings, then I have a 1% chance that his string is the same as one of mine. So the question is simply, "how many strings can I generate?" Let's put my monkeys on the job. They can generate one string each per second. Let's give them a million years in which to work. Our challenge is to reproduce it somewhere on Earth, and I understand there's approximately 150,000,000,000,000 square metres of land space available on the planet. Let's give each monkey one square metre for his work, so that's also how many monkeys we have. I'll arrange it so that they never type the same thing twice, so none of our output is wasted.

One million years and a planetful of monkeys and typewriters later, we've produced somewhere on the order of 4.73364e+27 strings. Compare that to the total number of possible strings, which was about 6.27710e+57. What fraction of the possible strings have we produced? Approximately 7.54112e-31. What's that as a percentage? I'll show you.

  0.0000000000000000000000000000754112%

That's the chance that one of my monkeys produced the string we wanted.
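
The monkey arithmetic above can be reproduced with a short Python sketch. It assumes a year of 365.25 days (my assumption; the text does not say which year length was used), plus the monkey count, typing rate, and time span given above. The names are illustrative only.

```python
# Assumption: a year of 365.25 days, i.e. 365.25 * 86,400 seconds.
SECONDS_PER_YEAR = 31_557_600
YEARS = 1_000_000                 # a million years, per the text
MONKEYS = 150_000_000_000_000     # one monkey per square metre of land, per the text
STRINGS_PER_MONKEY_PER_SECOND = 1

# Total distinct strings typed, assuming no monkey ever repeats a string.
strings_produced = MONKEYS * STRINGS_PER_MONKEY_PER_SECOND * SECONDS_PER_YEAR * YEARS

# Total possible 32-character strings over a 64-character set.
total_strings = 64 ** 32

# Fraction of all possible strings covered, and the same value as a percentage.
fraction = strings_produced / total_strings
percentage = fraction * 100

print(f"strings produced: {strings_produced:.5e}")
print(f"fraction covered: {fraction:.5e}")
print(f"as a percentage:  {percentage:.5e} %")
```

Because the monkeys never repeat themselves, the fraction of strings covered is also the probability that the target string is among them.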

This is where the numbers end and you have to start arguing by analogy. That number is not zero, so there is still a VERY SMALL CHANCE that we got the string we wanted. But how small is very small?

Here in my home state of New South Wales, Australia, the government is so kind as to engage in on-going research into probability, funded VOLUNTARILY by the populace. Isn't that nice? They call it "Lotto". Sometimes people win a lot of money as a part of this research. If you can correctly choose six numbers from a selection of forty-four numbers, you win the First Division Prize, which could easily be a million dollars or more. Unfortunately, the chances of picking those numbers correctly are slightly worse than seven million to one against. (The maths here involves "combinatorics", which would take a while to explain if you aren't already familiar with it.)
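
For the curious, the "seven million to one" figure is just the number of ways of choosing 6 numbers from 44 when order doesn't matter. A minimal sketch, assuming a single game per ticket and Python 3.8 or later (for `math.comb`):

```python
import math

NUMBERS_IN_POOL = 44   # numbers to choose from, per the text
NUMBERS_PICKED = 6     # numbers on one entry, per the text

# Number of distinct 6-number selections; exactly one of them wins First Division.
lotto_combinations = math.comb(NUMBERS_IN_POOL, NUMBERS_PICKED)

# Probability that a single entry wins First Division.
win_probability = 1 / lotto_combinations

print(lotto_combinations)
print(win_probability)
```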

So, I imagine that you are already familiar with this kind of lottery, yes? Let's compare the chances of guessing the right string with winning the lottery. Let's say I decide to play Lotto every Wednesday, so I go out and buy my first ticket. By an incredible stroke of luck, I win the First Division Prize! Encouraged by this result, I buy another ticket next week, and I win the First Division Prize again! This seems to be really easy money, so I buy another one next week and win the First Division Prize again. By this time, half of the population is out to lynch me because they are sure I'm cheating.

Some people win big on Lotto. I don't think I've ever heard of anyone winning the First Division Prize more than once, though, and I'm QUITE sure that I've never heard of anyone winning it TWICE IN A ROW. The chances of my trillions of monkeys producing the right answer are like the chances of winning the First Division Prize in Lotto FOUR TO FIVE TIMES IN A ROW. The chances of guessing the right string in only one attempt are about like winning the First Division Prize in Lotto EIGHT TO NINE TIMES IN A ROW.
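
The "times in a row" comparison can be sketched as follows. The idea is to ask how many independent First Division wins, multiplied together, give the same probability as the event in question; that is a ratio of logarithms. This assumes each draw is independent and each week's entry is a single game, and the function name is my own invention for illustration.

```python
import math

# Ways of choosing 6 numbers from 44; one entry wins with probability 1 / LOTTO_ODDS.
LOTTO_ODDS = math.comb(44, 6)

# Figures from the text: all possible strings, and the monkeys' total output
# (150 trillion monkeys, one string per second, a million years of 365.25 days).
total_strings = 64 ** 32
strings_produced = 150_000_000_000_000 * 31_557_600 * 1_000_000


def equivalent_consecutive_wins(probability: float) -> float:
    """Return n such that (1 / LOTTO_ODDS) ** n equals the given probability."""
    return math.log(probability) / math.log(1 / LOTTO_ODDS)


# The whole planet of monkeys working for a million years.
monkey_wins = equivalent_consecutive_wins(strings_produced / total_strings)

# A single guess at the string.
single_guess_wins = equivalent_consecutive_wins(1 / total_strings)

print(f"monkeys:      {monkey_wins:.2f} wins in a row")
print(f"single guess: {single_guess_wins:.2f} wins in a row")
```

The results are fractional, which is why the comparison is quoted as a range such as "four to five" rather than a whole number of wins.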

Does that help put things in perspective? If you think you can guess the right string, then guess lottery numbers instead! It's much more financially rewarding!

So what do you all think? Should I adopt "Consecutive Lotto First Division Prize Wins" as the new Nutter probability metric?

Public Domain: the author waives copyright on this log entry. Other sources (if any) are quoted with permission or on the principle of "fair dealing" and retain their original copyrights.