Skip to main content

Random Numbers and Modulo Bias

I recently wanted to generate some cryptographically secure random numbers. I ended up looking at some Perl code for this, and found an interesting comment in a function used for generating random numbers between 0 and a limit:

# A random number between 0 and 2^32 - 1.
my $int = $RNG->irand();

# We can't just use the mod operator because
# it will bias our output. Search for "modulo
# bias" on the Internet for details. This is
# slower than mod(), but does not have a bias.
return int(($int / 2**32) * $limit);

I found this interesting – at first I didn’t get why using modulo made the random numbers generated less secure.

It turns out that the reason goes something like this: let’s say we can generate a random integer x that’s between 0 and 7, and we want a random integer y between 0 and 2.

The natural solution would be to do something like y = x % 3. Let’s have a look at all the possible outcomes when we do this:

x 0 1 2 3 4 5 6 7
y = x % 3 0 1 2 0 1 2 0 1

From this, we can see there is a 37.5% chance of getting y = 0, a 37.5% chance of getting y = 1, but only a 25% chance that we get y = 2. Clearly, the modulo approach is unsuitable if we want the random integers we generate to not be predictable.

When looking at the table, this makes sense, since the number of possible values of y (three) doesn’t divide evenly into the number of possible values of x (eight).

We can fix this by discarding the generated value of x if its value is 6 or 7. Now, the number of possible values of y divides evenly into the number of possible values of x, so we have a one in three chance of getting 0, 1, or 2.

my $x;
do {
  $x = get_random_integer(); # between 0 and 7
} while ($x >= 6);
my $y = $x % 3;

In fact, it turns out the Perl code at the top of this post isn’t secure for very large limits. For example, if $limit is roughly two thirds of 232, even results will be about twice as likely as odd results. However, for small limits (like 3 in our example), that approach is probably fine because the bias would be negligible.