Are Computers Really Random?

During the 1940s, in the heart of the Manhattan Project, a group of brilliant scientists were racing to simulate the behavior of nuclear particles. These weren’t just any simulations; they were part of the secret effort to build the atomic bomb.

One major problem? They needed an enormous amount of random numbers to model the probabilistic behavior of particles, something no one had ever attempted at such a scale before.

At first, researchers like Stanislaw Ulam and Nicholas Metropolis relied on actual physical randomness by flipping coins and using random mechanical devices to generate numbers. But this was painfully slow and unsustainable for the computing tasks ahead.

Enter John von Neumann.

Rather than depend on noisy hardware, von Neumann proposed a new approach: simulating randomness using deterministic arithmetic. He introduced what would become one of the first pseudorandom number generators (PRNGs), the Middle-Square Method (MSM).

The method was simple:

  • Start with a number (e.g., 4614),
  • Square it: 4614² = 21,288,996,
  • Extract the middle digits (e.g., 2889),
  • Use that as the next input and repeat the process (see the sketch below).
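
Here is a minimal sketch of those steps in Python, written as a generator that yields one middle-square value per call. The four-digit width is an assumption chosen to match the example above.

def middle_square(seed, digits=4):
    """Von Neumann's middle-square method: square the state, keep the middle digits."""
    state = seed
    while True:
        squared = str(state ** 2).zfill(2 * digits)  # pad so the "middle" is well defined
        start = (len(squared) - digits) // 2
        state = int(squared[start : start + digits])
        yield state

gen = middle_square(4614)
print([next(gen) for _ in range(5)])  # [2889, 3463, ...]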

Although this method eventually fell into short cycles or degenerated (landing on a value like 0000, for instance, traps the sequence there forever), it was still a landmark idea. It offered repeatable, fast, and machine-friendly randomness at a time when the stakes were nuclear.

And yet, von Neumann knew he was cutting corners in chaos. He famously quipped:

“Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin.” - John von Neumann

Was it a joke? A confession? A warning?

Whatever it was, it captured a core tension in computing: can a machine truly fake randomness?

In trying to simulate nature’s unpredictability with pure arithmetic, von Neumann gave birth to pseudorandomness and committed the “first sin” in the process.

Randomness and Determinism

Randomness is a silent partner in our digital lives. When a music app shuffles songs or a Minecraft player spawns a world rich with resources, it’s randomness, or more precisely a random seed, at play. We depend on it for everything from secure passwords to unpredictable gameplay, yet we rarely question where this randomness actually comes from.

But cracks begin to show. In one infamous case, an employee of the Multi-State Lottery Association secretly modified the code behind a pseudorandom number generator used in lottery draws, rigging jackpots in his favor for years. In another, a well-intentioned change to Debian’s OpenSSL library drastically weakened the randomness of cryptographic keys, leaving millions of systems vulnerable to attack. Both incidents expose a disturbing truth: what appears random can often be reverse engineered when you know the system. We ask our machines for chaos, yet they are built for order.

At their core, computers are deterministic. As MIT’s Steve Ward puts it, “If you ask the same question, you’ll get the same answer every time.” This predictability is a feature, not a flaw. After all, we don’t want a calculator that gives different results each time, or a phone that dials random numbers.

And yet, this very precision makes randomness seem impossible. If a number comes from a formula, how can it be random? It’s not chaos, it’s a recipe. This sets the stage for our exploration: how do computers create the illusion of randomness, and is it ever possible for them to escape their deterministic nature and touch something truly unpredictable?

Pseudorandom Number Generators (PRNGs)

The computer’s first solution to the problem of randomness is a masterful act of deception. If you cannot be random, you can act random. This is the domain of the Pseudorandom Number Generator (PRNG), an algorithm designed to produce a sequence of numbers whose properties approximate those of a truly random sequence. The numbers are pseudo (false), because they are generated by a deterministic process.

The secret to a PRNG lies in two components: the seed and the algorithm. The seed is an initial value that kickstarts the generator. The algorithm is a mathematical function that takes the current number (or “state”) and transforms it into the next one. The entire, often very long, sequence of numbers is completely determined by that initial seed. If you provide the same seed, you will get the exact same sequence of numbers, every single time. This property, known as reproducibility, is a double-edged sword. For scientific simulations or debugging code, it is an invaluable feature, allowing researchers to replicate experiments perfectly. However, for applications like cryptography or online gambling, it is a catastrophic vulnerability. If an attacker can learn the seed, they can predict the entire sequence of “random” numbers, from the past into the future. We will explore some famous PRNGs below:
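
As a quick illustration of reproducibility, here is a small sketch using Python’s built-in random module (itself a PRNG we will meet again below): seeding it twice with the same value yields the exact same sequence.

import random

random.seed(1234)
first_run = [random.random() for _ in range(3)]

random.seed(1234)  # reseed with the same value...
second_run = [random.random() for _ in range(3)]

print(first_run == second_run)  # True: the same seed reproduces the sequence exactly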

The Linear Congruential Generator (LCG)

We begin our investigation with one of the oldest and simplest PRNGs: the Linear Congruential Generator (LCG). Its elegance lies in its simplicity, which also makes its flaws beautifully transparent.

An LCG generates its sequence using a simple recurrence relation. Given a current number \(X_n\), the next number \(X_{n+1}\) is calculated as:

\[X_{n+1} = (a \cdot X_n + c) \pmod{m}\]

The quality of the numbers is extremely sensitive to the choice of the parameters: the multiplier a, the increment c, and the modulus m. A poor choice can lead to sequences that are obviously non-random. A famous example of a flawed LCG is RANDU, which was widely used in the 1960s and 1970s and later found to have serious defects that invalidated the results of many scientific studies.

Here’s a simple implementation of an LCG in Python:

class LCGGenerator:
    """A minimal Linear Congruential Generator: X_{n+1} = (a * X_n + c) mod m."""

    def __init__(self, seed=1, modulus=2**32, multiplier=1664525, increment=1013904223):
        self.state = seed
        self.m = modulus      # m: the modulus
        self.a = multiplier   # a: the multiplier
        self.c = increment    # c: the increment

    def next(self):
        # Advance the state with the recurrence and return the new raw integer.
        self.state = (self.a * self.state + self.c) % self.m
        return self.state

    def normalized_next(self):
        # Scale the raw integer into the half-open interval [0, 1).
        return self.next() / self.m
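
Here is one way you might exercise the class. The second instance uses the parameters of the infamous RANDU generator (a = 65539, c = 0, m = 2^31, with an odd seed), included only to show how innocuous a disastrous parameter choice can look.

gen = LCGGenerator(seed=42)
print([gen.normalized_next() for _ in range(3)])

# RANDU's parameters look harmless, yet consecutive triples of its outputs
# fall onto a handful of planes in 3D space.
randu = LCGGenerator(seed=1, modulus=2**31, multiplier=65539, increment=0)
print([randu.normalized_next() for _ in range(3)])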

The Mersenne Twister (MT)

For many years, the undisputed champion of the PRNG arms race was the Mersenne Twister, developed in 1997. It is the default PRNG in a vast array of software, including Python, R, and MATLAB. It was designed specifically to overcome the flaws of LCGs, boasting a colossal period of \(2^{19937}-1\) and excellent statistical properties across high dimensions.

Implementing the Mersenne Twister from scratch is a formidable task, as it involves complex matrix algebra over a finite field. It is far more instructive to use the trusted, standard implementation available in most programming languages. In Python, this is the random module.

Here’s how to use the MT in practice, via Python’s built-in random module:

import random

random.seed(42)              # seed the global Mersenne Twister instance
print(random.random())       # a float in [0.0, 1.0)
print(random.randint(1, 6))  # e.g., one roll of a six-sided die

Despite its impressive statistical resume, the Mersenne Twister has a fatal weakness: it is not cryptographically secure. Its state is based on a linear recurrence, meaning that after observing just 624 consecutive outputs, an attacker can reconstruct its internal state and predict all future numbers.

The XORShift Generator

Seeking even greater speed and simplicity, George Marsaglia introduced the XORShift family of generators in 2003. These algorithms are the sprinters of the PRNG world, prized for their incredible speed and compact code. A new number is generated by taking the previous number and applying a series of bitwise XOR and bit-shift operations.

Here’s a simple implementation of an XORShift generator in Python:

state = 88172645463325252 # Seed must be non-zero

def xorshift64():
    """One step of Marsaglia's xorshift64, using the classic (13, 7, 17) shift triple."""
    global state
    x = state
    x ^= (x << 13) & 0xFFFFFFFFFFFFFFFF  # the masks keep Python's integers within 64 bits
    x ^= (x >> 7) & 0xFFFFFFFFFFFFFFFF
    x ^= (x << 17) & 0xFFFFFFFFFFFFFFFF
    state = x
    return x
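
To turn the raw 64-bit output into floats, one common convention is simply to divide by 2^64:

samples = [xorshift64() / 2**64 for _ in range(5)]
print(samples)  # five floats in [0, 1)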

By themselves, simple XORShift generators fail some of the more stringent statistical tests due to their linearity. Modern variants, like xorshift+, add a final, non-linear step (such as addition) to scramble the output, which dramatically improves the statistical quality.
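
As a rough illustration of that idea, here is a sketch in the spirit of xorshift128+, using one published set of shift constants (23, 17, 26); the seed words here are arbitrary non-zero values chosen for demonstration, and this should not be treated as a vetted implementation.

MASK64 = 0xFFFFFFFFFFFFFFFF

# Two 64-bit words of state; they must not both be zero.
s0, s1 = 0x9E3779B97F4A7C15, 0x6A09E667F3BCC909

def xorshift128plus():
    global s0, s1
    x, y = s0, s1
    s0 = y
    x = (x ^ (x << 23)) & MASK64        # xorshift scramble of the first word
    s1 = x ^ y ^ (x >> 17) ^ (y >> 26)  # mix in the second word with two more shifts
    return (s1 + y) & MASK64            # the final addition is the non-linear step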

Cryptographically Secure PRNGs (CSPRNGs)

When the stakes are highest in cryptography, the definition of good changes entirely. It is no longer enough for a sequence to have good statistical properties. The paramount requirement is unpredictability. A Cryptographically Secure Pseudorandom Number Generator (CSPRNG) is an algorithm designed to resist prediction even when an attacker has complete knowledge of the algorithm and a long history of its previous outputs.

Implementing a CSPRNG is a task for security experts, and a core principle of secure programming is to never roll your own crypto. Instead, you should always use the trusted, heavily vetted CSPRNG provided by your operating system. On modern systems, this is accessed through special device files like /dev/urandom on Linux or dedicated system calls. In Python, the os.urandom() function provides direct access to this secure source.

Here’s how to access a CSPRNG in Python:

import os

# Get 16 cryptographically secure random bytes.
# Suitable for generating a secret key.
random_bytes = os.urandom(16)

This source is continuously stirred with genuine entropy harvested from the unpredictable chaos of the computer’s operation: the precise timing of mouse movements, keyboard presses, and network packet arrivals. This constant injection of real-world unpredictability makes it secure for virtually all applications.
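
For everyday application code, Python’s standard secrets module is a convenient wrapper over the same operating-system source:

import secrets

token = secrets.token_hex(16)    # 32 hex characters, suitable for an API key or session token
roll = secrets.randbelow(6) + 1  # an unbiased, unpredictable die roll
print(token, roll)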

True Randomness

As seen above, the output of a PRNG, no matter how sophisticated, is ultimately a deterministic sequence in disguise. To find genuine, unpredictable randomness, we must escape the logical confines of the computer and tap into the physical world. This is the purpose of a True Random Number Generator (TRNG), also known as a Hardware Random Number Generator (HRNG).

A TRNG is a device that generates random numbers from a physical process rather than a computational one. The physical phenomena harnessed for this purpose are diverse and fascinating, including the chaotic jitter in electronic clock signals, the photoelectric effect, and quantum phenomena.

Cloudflare’s Wall of Entropy


Perhaps the most visually iconic example of a TRNG is the Wall of Entropy at Cloudflare's San Francisco headquarters. This system uses a wall of about 100 lava lamps to help secure a significant fraction of internet traffic. A camera continuously records the chaotic and unpredictable motion of the wax blobs, and this video feed is converted into a stream of random numbers. This system exemplifies a crucial design pattern: the TRNG (the lava lamps) acts as the slow but truly unpredictable source of entropy, which is then used to seed the fast and efficient CSPRNGs running on Cloudflare’s servers.

A DIY True Random Number Generator

The principles behind a TRNG are surprisingly accessible. It is possible to build one using common hardware that most computers already possess. While sources like radioactive decay require specialized equipment, and others like mouse movements can be influenced by the user, the noise from a microphone provides an excellent and readily available source of entropy.

An open microphone picks up a faint hiss composed of ambient background noise and, more importantly, the thermal noise generated by the random motion of electrons within its own electronic circuitry. This signal is fundamentally unpredictable. We can build a simple TRNG in Python by capturing this noise, extracting the least significant bits (LSBs), and whitening the result with a cryptographic hash function to remove biases.
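
Here is a rough sketch of that idea, assuming the third-party sounddevice and numpy packages are installed. A real design would need careful entropy estimation and health testing, so treat this as a toy.

import hashlib
import numpy as np
import sounddevice as sd

def microphone_random_bytes(n_bytes=32, samplerate=44100, seconds=1.0):
    """Harvest entropy from microphone noise: record, keep the LSBs, whiten with SHA-256."""
    frames = int(samplerate * seconds)
    recording = sd.rec(frames, samplerate=samplerate, channels=1, dtype='int16')
    sd.wait()                                          # block until the recording finishes
    lsbs = (recording.flatten() & 1).astype(np.uint8)  # keep only the noisiest bit per sample
    raw = np.packbits(lsbs).tobytes()
    out = b""
    counter = 0
    while len(out) < n_bytes:                          # hashing removes bias but adds no entropy
        out += hashlib.sha256(raw + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n_bytes]

print(microphone_random_bytes(16).hex())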

Statistical Analysis

To understand how random these generators really are, I tested their outputs using common statistical metrics:

  • Mean: Should be close to \(0.5\) for numbers uniformly distributed between \(0\) and \(1\). If it’s off, the generator may produce more high or low numbers.
  • Variance: Measures how spread out the values are. For ideal uniform randomness, variance should be around \(0.083\). Too low means clustering; too high means unevenness.
  • Chi-Square Test (p-value): Checks whether the number distribution matches what we expect from a uniform distribution. A high p-value \((> 0.05)\) means we can’t reject the idea that it’s uniform.
  • Kolmogorov–Smirnov (K-S) Test (p-value): Checks how close the actual distribution is to the expected one across the full range. Again, higher is better (a sketch of how these checks can be run follows the list).
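
Here is a minimal sketch of how such checks can be run with numpy and scipy; the exact setup behind the table below may differ.

import numpy as np
from scipy import stats

samples = np.random.random(10_000)  # stand-in for any generator's output in [0, 1)

print("mean:", samples.mean(), "variance:", samples.var())

# Chi-square: bin the samples and compare observed counts to a uniform expectation.
observed, _ = np.histogram(samples, bins=10, range=(0.0, 1.0))
chi2, chi_p = stats.chisquare(observed)
print("chi-square p-value:", chi_p)

# Kolmogorov-Smirnov: compare the empirical distribution to the uniform CDF.
ks_stat, ks_p = stats.kstest(samples, "uniform")
print("K-S p-value:", ks_p)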

The results below show that no single generator is best for everything. A generator can look statistically perfect (like the poorly parameterized LCG) and still be unreliable in practice. Some, like CSPRNG or TRNG, may not be textbook-uniform but are still far more secure because they’re unpredictable, not just random-looking. The best tool depends on the task: fast simulations need efficiency, cryptography demands secrecy, and real entropy offers ultimate unpredictability.

Here’s the summary table:

Generator | Mean | Variance | Chi-Square p-value | K-S Test p-value | Notes
--- | --- | --- | --- | --- | ---
LCG (Good Parameters) | 0.499664 | 0.082337 | 0.929049 | 0.760772 | Fast, good for basic tasks
LCG (Poor Parameters) | 0.506017 | 0.083422 | 0.947698 | 0.230356 | Bias risk despite passing tests
XORShift | 0.498804 | 0.082538 | 0.080238 | 0.583085 | Fast, not secure
Mersenne Twister | 0.500251 | 0.082759 | 0.364056 | 0.825626 | Reliable for general use
CSPRNG | 0.495870 | 0.084255 | 0.583243 | 0.194732 | Best for secure applications
TRNG (Microphone) | 0.501197 | 0.084420 | 0.340698 | 0.548200 | Real-world entropy source


Here is what this means for each generator:

  • LCG with good parameters performs very well. It produces a balanced output and passes both tests with high confidence. It’s lightweight and suitable for simulations or simple applications.
  • LCG with poor parameters surprisingly passes the tests, but its mean is skewed. This means it might still have long-term bias, making it unreliable for sensitive tasks.
  • XORShift is fast and simple, often used in games or embedded systems. But its lower p-value in the Chi-square test may hint at underlying patterns.
  • Mersenne Twister remains a favorite for general-purpose randomness. It offers excellent balance but is not cryptographically secure, so it’s not ideal for things like passwords or encryption keys.
  • CSPRNG (Cryptographically Secure PRNG) is designed for unpredictability. While its mean and variance look slightly off, its core strength lies in being hard to reverse or predict.
  • TRNG (based on microphone noise) pulls randomness from the physical world. It performed surprisingly well, and with further filtering, it can serve as a strong entropy source or seed generator.

If you want to play around with the implementation of PRNGs and my version of TRNG, you can find it here along with test code.

What is Randomness, Really?

Note: The following section explores the deeper theoretical side of randomness, ideal for readers curious about the foundations of computer science. If you’re just here for the practical story, feel free to skip ahead and still enjoy the rest of the blog.

From “How” to “What”

So far, our investigation has focused on the how: how do we build algorithms that produce sequences that look random? We have judged them based on their ability to pass statistical tests. But this approach feels incomplete. It raises a more profound question: what is randomness? Can we define it in a way that is absolute and not dependent on a particular set of tests?

Kolmogorov Complexity

The most powerful answer to this question comes not from statistics, but from theoretical computer science, specifically the field of algorithmic information theory, pioneered by Andrey Kolmogorov in the 1960s. The central idea, known as Kolmogorov complexity, is as elegant as it is profound: the randomness of an object (like a string of bits) is the length of the shortest possible computer program that can produce that object as output.

Consider two binary strings, each 32 characters long:

s1 = 01010101010101010101010101010101
s2 = 11011000101100101001000101110010

Intuitively, s1 is not random; it has a clear, simple pattern. s2, which was generated by flipping a coin, appears random. Kolmogorov complexity formalizes this intuition.

The shortest program to produce s1 is something like print("01" * 16). This program is much shorter than the string itself. Therefore, s1 has low Kolmogorov complexity.

For s2, there is no obvious pattern. The shortest program we can think of is likely print("11011000101100101001000101110010"), which is essentially the same length as the string itself. Therefore, s2 has high Kolmogorov complexity.

This leads to the ultimate definition of algorithmic randomness. A string is defined as incompressible (or Kolmogorov random) if its Kolmogorov complexity is greater than or equal to its length. An incompressible string is one that cannot be described in any way that is more concise than simply stating the string itself. It is pure information, with no redundancy or discernible patterns that a computer could exploit to compress it.
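
Kolmogorov complexity itself cannot be computed, but a general-purpose compressor gives a rough, computable feel for the same intuition. In this sketch (scaled up to longer strings so compressor overhead doesn’t dominate), the patterned string compresses far better than the coin-flip string.

import random
import zlib

patterned = "01" * 5000                                          # highly regular, like s1
coin_flips = "".join(random.choice("01") for _ in range(10000))  # irregular, like s2

for name, s in [("patterned", patterned), ("coin_flips", coin_flips)]:
    print(name, "length:", len(s), "compressed:", len(zlib.compress(s.encode())))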

This powerful definition comes with a stunning twist: the Kolmogorov complexity of a string is uncomputable. There is no algorithm that can take an arbitrary string and tell you its complexity. This means we can have a perfect definition of randomness, but we can never be certain that any given string satisfies it.

Does Randomness Make Computers More Powerful?

This deep dive into theory brings us back to a practical question for computer science: does having access to a source of perfect randomness let us solve problems more efficiently? This is the essence of one of the most significant open questions in complexity theory: P vs BPP.

Let’s define these classes intuitively:

  • P (Polynomial Time): This class contains all decision problems (questions with a yes/no answer) that can be solved efficiently by a deterministic computer.
  • BPP (Bounded-Error Probabilistic Polynomial Time): This class contains all decision problems that can be solved efficiently by a computer that has access to a source of random bits, with an error probability bounded away from 1/2 (e.g., at most 1/3). A classic example of a problem in BPP is primality testing (sketched below).
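
As a concrete taste of a BPP-style algorithm, here is a compact sketch of the Miller-Rabin primality test: each random witness either proves the number composite or leaves a bounded chance of error.

import random

def is_probably_prime(n, rounds=20):
    """Miller-Rabin: randomized primality test with error probability at most 4**-rounds."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    d, r = n - 1, 0
    while d % 2 == 0:          # write n - 1 as d * 2**r with d odd
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)  # a random witness
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False       # witness found: n is definitely composite
    return True                # no witness found: n is probably prime

print(is_probably_prime(2**61 - 1))  # True: 2**61 - 1 is a Mersenne prime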

Clearly, P ⊆ BPP, because a deterministic algorithm is just a probabilistic one that ignores its random bits. The monumental question is whether BPP is any larger than P. The overwhelming consensus among computer scientists is that it is not. They conjecture that P = BPP, i.e., randomness doesn’t fundamentally add power to efficient computation.

If this conjecture is true, it would mean that randomness, while an incredibly useful tool for designing simple and elegant algorithms, does not grant computers any fundamental new power. The primary path to proving P = BPP is through derandomization: constructing a PRNG so powerful that it can fool any efficient algorithm. If such a generator exists, we could replace the true random bits in any BPP algorithm with the output of this PRNG, turning it into a deterministic P algorithm and proving the two classes are equal.

The Final Verdict

We return to our initial question: Are computers really random? The investigation has revealed a nuanced answer. A computer, as a purely logical, deterministic Turing machine, cannot create randomness from nothing. In this strict sense, the answer is no. Its own thoughts are bound by algorithms and are ultimately predictable.

However, this is a limited view. A computer is not an isolated brain; it is a physical device connected to the world. When equipped with sensors (a camera watching lava lamps, a microphone listening to thermal noise), it can become a perfect conduit for the randomness inherent in the physical universe. It can harvest this unpredictability, refine it, and put it to use. In this practical and more meaningful sense, the answer is a resounding yes. Computers can provide us with randomness of the highest quality, so long as they are listening to the right source.

Our journey has taken us from the elegant but flawed patterns of the LCG, through the escalating arms race of PRNGs, and into the profound theoretical depths of Kolmogorov complexity and the P vs. BPP problem. We saw that the pursuit of randomness has forced computer science to confront its own fundamental limits.

The quest for randomness in computing reveals a beautiful and deep truth. We build these machines to be paragons of logic, precision, and order. Yet, to perform some of their most critical tasks, from securing our data to modeling our world, they must embrace chaos.

The ultimate source of unpredictability isn’t found in a cleverer algorithm or more complex formula. It lies in the machine’s ability to open a window to the physical world and listen to the ceaseless, chaotic hum of the universe.

Randomness, then, is not something a computer generates but something it discovers. It is the interface between the sterile, logical world of computation and the messy, unpredictable reality we inhabit. In the end, a computer can only be as random as the universe it is a part of.

And yet, most of the time, that’s okay because pseudorandom generators are good enough for nearly everything we ask of them. They’re fast, controllable, and astonishingly effective at mimicking chaos.

But if even our machines must borrow unpredictability from the cosmos, can anything truly be random without touching the real world?

Feel free to share your thoughts in the comments below and don’t forget to leave a reaction!