Mostly out of my head; feel free to correct me if you spot an error:

- d1901fa3176ffd7d77f2cd4dde125829
- The most common use for random data in computer security is as a secret: a password, a key, or a session ID for instance.
- Random data is the result of a random process.
- An ideal toss of a coin generates 1 bit of random data.
For**[Update 2009-11-27:**Or maybe not.**]***n*tosses there are 2^{n}different strings of n bits that could be the result. - Producing random data is hard on a mostly deterministic machine, particularly if large quantities are needed.
- Data that
*looks*random doesn’t have to be. For example, base64-encoded data looks random to some people although it isn’t. - The output of a random process has certain statistical properties that can be tested.
- A deterministic process can produce output that exposes the same statistical properties as a truly random process for any finite number of bits.
- It is therefore impossible, for any finite string of bits, to prove that it is the result of a random process.
- It is, however, possible to detect and reject as non-random the results of some deterministic processes on the basis of statistical tests.
- A deterministic process designed to produce output that passes some of the statistical randomness tests is called a pseudorandom number generator (PRNG).
- There are different qualities of pseudorandom number generators. The simpler ones are unsuitable for security applications.
- Data from a PRNG that looks random according to statistical test can still be entirely predictable if the PRNG is known and its internal state can be worked out.
- The whole point of using random data in security is to make guessing hard. The idea is that to guess a random secret an attacker will have to search the entire space of possible values in the worst case and half of it in the average case.
- Attackers cannot be prevented from guessing secrets. If there is an easier way than guessing, the attacker is assumed to prefer the easier way.
- The inner state of some PRNGs can be determined from a small number of subsequent output values.
- Up to a certain length and under a number of side conditions, the output of cryptographic primitives such as hash functions or ciphers are expected to expose the statistical properties of random data.
- Pseudorandom number generators for security purposes often combine a source of random data with cryptographic methods to produce an almost arbitrary amount of random data. True random data is used to seed the generator, which then expands the data.
- Potential sources of random data in a deterministic computer system are external events and digitized noise from analog circuits.
- Using digitized noise isn’t easy either. The process might be biased, limiting the randomness obtained.
- Humans are poor random number generators unless they execute a random process, e.g. repeatedly tossing a coin.
- Stream ciphers combine a (pseudo)random key stream with the data stream using bitwise XOR.
- A string of truly random data as a message has maximum entropy. Such random data thus cannot be compressed.
- Random data can also be used as test input into software components. This type of testing is called
*fuzzing*. - For an ideal block cipher, flipping a single bit in either the clear text or the key flips each bit of the cipher text with probability 0.5.

Note that I’m sloppy in my use of the term *random*. Randomness can mean a lot of different things and I’m really making a few assumptions.