# Pseudorandom number generators | Computer Science | Khan Academy

(fun music) (leaves rustling) – [Voiceover] When we

observe the physical world we find random fluctuations everywhere. We can generate truly random numbers by measuring random

fluctuations, known as noise. When we measure this

noise, known as sampling, we can obtain numbers. – [Voiceover] One, two, three, four– – [Voiceover] For example, if we measure the electric current

of TV static over time, we will generate a truly random sequence. We can visualize this random sequence by drawing a path that changes direction according to each number,

known as a random walk. Notice the lack of pattern at all scales. At each point in the sequence the next move is always unpredictable. Random processes are said

to be nondeterministic, since they are impossible

to determine in advance. Machines, on the other

hand, are deterministic. Their operation is

predictable and repeatable. In 1946, John von Neumann was involved in running computations for the military; specifically involved in the

design of the hydrogen bomb. Using a computer called the ENIAC, he planned to repeatedly

calculate approximations of the processes involved

in nuclear fission. However, this required quick access to randomly generated numbers that could be repeated, if needed. However, the ENIAC had very

limited internal memory; storing long random

sequences was not possible. So, Neumann developed an algorithm to mechanically simulate

the scrambling aspect of randomness as follows: First, select a truly random

number, called the “seed”. This number could come from

the measurement of noise, or the current time in milliseconds. Next, this seed is provided as input to a simple calculation. Multiply the seed by itself, and then output the middle of this result. Then you use this output as the next seed, and repeat the process

as many times as needed. This is known as the middle-squares method and is just the first in a long line of pseudorandom number generators. The randomness of the

sequence is dependent on the randomness of the initial seed only. Same seed, same sequence. So, what are the differences between a randomly generated versus pseudorandomly generated sequence? Let’s represent each

sequence as a random walk. They seem similar until

we speed things up. The pseudorandom sequence

must eventually repeat. This occurs when the

algorithm reaches a seed it has previously used,

and the cycle repeats. The length, before a

pseudorandom sequence repeats, is called “the period”. The period is strictly limited by the length of the initial seed. For example, if we use a two-digit seed, then an algorithm can produce, at most, 100 numbers, before reusing a seed and repeating the cycle. A three-digit seed can’t

expand past 1,000 numbers before repeating its cycle, and a four-digit seed can’t expand past 10,000 numbers before repeating. Though if we use a seed large enough, the sequence can expand into trillions and trillions

of digits before repeating. Though the key difference is important. When you generate numbers pseudorandomly, there are many sequences

which cannot occur. For example, if Alice generates

a truly random sequence of 20 shifts, it’s equivalent

to a uniform selection from the stack of all

possible sequences of shifts. This stack contains 26

to the power of 20 pages, which is astronomical in size. If we stood at the bottom

and shined a light upwards, a person at the top

would not see the light for around 200,000,000 years. Compare this to Alice generating a 20 digit pseudorandom sequence, using a four-digit random seed. Now, this is equivalent

to a uniform selection from 10,000 possible initial seeds, meaning she can only generate 10,000 different sequences, which is a vanishingly small fraction of all possible sequences. When we move from random

to pseudorandom shifts, we shrink the key space into a much, much smaller seed-space. So, for a pseudorandom sequence

to be indistinguishable from a randomly generated sequence, it must be impractical for a computer to try all seeds and look for a match. This leads to an important distinction in computer science,

between what is possible, versus what is possible in

a reasonable amount of time. We use the same logic

when we buy a bike lock. We know that anybody can simply try all possible combinations, until they find a match and it opens. But it would take them days to do so. So, for eight hours we

assume it’s practically safe. With pseudorandom generators, the security increases as the

length of the seed increases. If the most powerful computer would take hundreds of years to

run through all seeds, then we safely can assume

it’s practically secure, instead of perfectly secure. As computers get faster the seed size must increase accordingly. Pseudorandomness frees Alice and Bob from having to share their entire random shift sequence in advance. Instead, they share a

relatively short random seed, and expand it into the same random-looking sequence when needed. But what happens if they can never meet to share this random seed?

in cryptographically secure pseudo-random number generators such as the /dev/random and /dev/urandom on Unix machines, it constantly takes up new seeds to prevent anyone known the state of the generator at any one time fully. It usually uses stuff like block ciphers and hashes to take essentially random input like keystrokes, and hardware interrupt timing to continually shuffle the output to keep it as secure as possible for generating keys and stuff for encryption such as using a https connection. For CSPRNGs it is required to pass tests for statistical randomness as you wouldn't want to use a encryption key even for a one time pad that was a very simple number that wasn't random at all.

Quantum computers are coming, we're damned

I got here from a book on C programming

I used 41 and got 4116814624384470560250062538447056025006253844etc

Although I ended up with a repeating sequence, I did not encounter the seed of 41 repeating itself. Did I do something wrong?

I felt like I watched those 1960s tutorial video. lol

the music in the background was so scary

While word size dependence is true for linear congruential generators, there are plenty of random number generators whose period is independent of the word size; as an example, the mersenne generator or R250. What you are saying is accurate but it comes across as if you are saying that LCG is the only algorithm.

4:12 what??? Shift? Pages? This part of the video made absolutely no sense.

For every, I don't know, 10,000 calculations, why don't we start taking MORE than just the middle 3 digits and make THAT be the new seed??? Problem solved, no?

Great video explaining random and pseudo random number generation.

plzz stop the background sound

Great Video! I got here from Angela's Course.

Publishers clearing house loads numbers in to a generator to pick winners

Please wat is this

it is on khan acadamey, too:https://www.khanacademy.org/computing/computer-science/cryptography/crypt/v/random-vs-pseudorandom-number-generators

https://youtu.be/aZbEF1aduok

An explanation of Linear congruential method to generate random numbers

The video was informative and the background music was eerie.

i hate the background music

Nice video, truly educational.

Love you Khan academy.

What happens at 2:28 ??? The second output is 529, not 587… Wtf???

This video was uncomfortable. But I had to watch as it was recommended by a javascript course I'm taking. Thanks!

2:06….here time is 9:37 which is converted into 121 millisecond . how is it possible?

I like this video as much as Kanye loves Kanye

I just realized that since this video is in a series, it's probable that Khan Academy defined "shifts" in an earlier video. I remember watching these last year and believe that this may be the case.

whats with the song in the background bro its just math math itn't scary … well it is but still

I didn't think this video was going to be what it is