# On Random Numbers

This is a short post which decomposes “random numbers” topic into principal components and maps them to Rust ecosystem.

## True Randomness

For cryptographic purposes (eg, generating a key pair for public key cryptography), you want to use real random numbers, derived from genuinely stochastic physical signals (hardware random number generator, keyboard input, etc). The shape of the API here is:

As this fundamentally requires talking to some physical devices, this task is handled by the operating system. Different operating systems provide different APIs, covering which is beyond the scope of this article (and my own knowledge).

In Rust, `getrandom`

crate provides a cross-platform wrapper for this functionality.

It is a major deficiency of Rust standard library that this functionality is not exposed there. Getting cryptographically secure random data is in the same class of OS services as getting the current time or reading standard input. Arguably, it’s even more important, as most applications for this functionality are security-critical.

## Pseudorandom Number Generator

For various non-cryptographic randomized algorithms, you want to start with a fixed, deterministic `seed`

, and generate a stream of numbers, statistically indistinguishable from random.
The shape of the API here is:

There are many different algorithms to do that.
`fastrand`

crate implements something sufficiently close to the state of the art.

Alternatively, a good-enough PRNG can be implemented in 9 lines of code:

This code was lifted from Rust’s standard library (source).

The best way to seed a PRNG is usually by using a fixed constant.
If you absolutely need *some* amount of randomness in the seed, you can use the following hack:

In Rust, hash maps include some amount of randomization to avoid exploitable pathological behavior due to collisions. The above snippet extracts that randomness.

## Non-Uniformly Distributed Random Numbers, Uniformly Distributed Random Non-Numbers.

Good PRNG gives you a sequence of `u32`

numbers where each number is as likely as every other one.
You can convert that to a number from 0 to 10 with `random_u32() % 10`

.
This will be good enough for most purposes, but will fail rigorous statistical tests.
Because 2^{32} isn’t evenly divisible by 10, 0 would be ever so slightly more frequent than `9`

.
There is an algorithm to do this correctly (if `random_u32()`

is very large, and falls into the literal remainder after dividing 2^{32} by 10, throw it away and try again).

Sometimes you you want to use `random_u32()`

to generate other kinds of random things, like a random point on a 3D sphere, or a random permutation.
There are also algorithms for that.

Sphere: generate random point in the unit cube; if it is also in the unit ball, project it onto the surface, otherwise throw it away and try again.

Permutation: naive algorithm of selecting a random element to be the first, then selecting a random element among the rest to be the second, etc, works.

There are libraries which provide collections of such algorithms.
For example, `fastrand`

includes most common ones, like generating numbers in range, generating floating point numbers or shuffling slices.

`rand`

includes more esoteric cases line the aforementioned point on a sphere or a normal distribution.

## Ambient Global Source Of Random Numbers

It is customary to expect existence of a global random number generator seeded for you. This is an anti-pattern — in the overwhelming majority of cases, passing a random number generator explicitly leads to better software. In particular, this is a requirement for deterministic tests.

In any case, this functionality can be achieved by storing a state of PRNG in a thread local:

## rand

`rand`

is an umbrella crate which includes all of the above.
`rand`

also provides flexible trait-based “plugin” interface, allowing you to mix and match different combinations of PRNGs and algorithms.
User interface of `rand`

is formed primarily by extension traits.

## Kinds Of Randomness

Circling back to the beginning of the post, it is very important to distinguish between the two use-cases:

- using unpredictable data for cryptography
- using statistically uniform random data for stochastic algorithms

Although the two use-cases both have “randomness” in their name, they are disjoint, and underlying algorithms and APIs don’t have anything in common. They are physically different: one is a syscall, another is a pure function mapping integers to integers.