Pseudo Random Number Generator. More...
#include <libecc/rng.h>
Public Types | |
typedef bitset< S_pool_size > | pool_type |
The type of the constructor argument. | |
Public Member Functions | |
rng (pool_type const &seed) | |
Constructor. | |
void | generate_512_bits (void) |
Generate 512 bits. | |
bitset< 512 > const & | get_512_bits (void) const |
Retrieve a reference to the 512 bits that were generated with a call to rng::generate_512_bits(). | |
void | add_entropy (uint32_t const *noise, unsigned int number_of_ints) |
Add entropy to the random number generator pool. | |
template<unsigned int n> | |
void | add_entropy (bitset< n > const noise) |
Add entropy from a bitset. | |
Static Public Attributes | |
static unsigned int const | S_pool_size |
Pool size in bits. This RNG uses a pool of 521 bits. |
Pseudo Random Number Generator.
This random number generator was designed from scratch. It is fast (2.5 million bit/second on a 900 Mhz PC), has a long period (2^521 - 1) and passes all known statistical tests for randomness.
The basis of this Random Number Generator (RNG) is a shift register of 521 bit using nine feedback points. Because such a feedback is linear, it is possible to write any amount of shift as NEWSTATE = MATRIX * OLDSTATE
. The matrix that corresponds with a single shift can easily be determined as function of the feedback points. Let's call this matrix M0
. The matrix corresponding to a shift of 2 then corresponds to M1 = M0 * M0
. A shift of 4 corresponds to M2 = M1 * M1
, etcetera.
The period of the RNG will be equal to the number of different internal states that will be used. In the optimal case all possible internal states, except all zeroes which would lead to an output of only zeroes, will be used and the period of the RNG will be equal to 2^{521} - 1. This hypothesis can be checked by confirming that M521 = M0^{2521} == M0
and because 2^{521} - 1 is a prime this value then must be the real period of the RNG. Of course, one of the feedback points is fixed at bit 520 (feeding to bit 0). By choosing the other feedback points close to bit 0, a very fast mangling of the bits is achieved.
Suppose we have a shift register of 3 bits (2^{3}-1 = 7 is a prime). Let's use one feedback point from bit 1 (and from bit 2 of course), then we have:
bit 0 1 2(output) state0: a b c state1: b c a^b state2: c a^b b^c ...
Which gives, with a starting seed of 001:
001 010 101 011 111 110 100 and back to the top
Or rotated 90 degrees:
bit 0 0010111 1 0101110 2 1011100
Each column can be considered as a vector:
a b c
The matrix representing one shift now is:
m11 m12 m13 a a m11 + b m12 + c m13 b m21 m22 m23 b = a m21 + b m22 + c m23 = c m31 m32 m33 c a m31 + b m32 + c m33 a+b mod 2 = a^b
From which we can deduce the matrix
0 1 0 M0 = 0 0 1 1 1 0
and that we have to do our calculation modulo 2.
Note that
0 1 0 0 1 0 0 0 1 M1 = M0 * M0 = 0 0 1 0 0 1 = 1 1 0 1 1 0 1 1 0 0 1 1
0 0 1 0 0 1 0 1 1 M2 = M1 * M1 = 1 1 0 1 1 0 = 1 1 1 0 1 1 0 1 1 1 0 1
and
0 1 1 0 1 1 0 1 0 M3 = M2 * M2 = 1 1 1 1 1 1 = 0 0 1 = M0 1 0 1 1 0 1 1 1 0
Which proves that this repeats after 2^{3}-1 = 7 times.
Because the state of the RNG changes in a linear way (we can use matrices), the output is predictable: the internal state can be calculated from a small amount of output.
In the above example assume that the RNG produces the output sequence xyz. If the orginal state was abc then after three bits of output the state would be:
a 0 1 0 0 0 1 a 1 1 0 a a+b M0 M1 b = 0 0 1 1 1 0 b = 0 1 1 b = b+c (mod 2) c 1 1 0 0 1 1 c 1 1 1 c a+b+c
while the output would be
x 0 0 1 ^{ } a 0 0 0 ^{ } a 0 0 0 ^{ } a 1 1 0 a y = 0 0 0 M0^{1} b + 0 0 1 M0^{2} b + 0 0 0 M0^{3} b = 0 1 1 b z 0 0 0 ^{ } c 0 0 0 ^{ } c 0 0 1 ^{ } c 1 1 1 c
The inverse of the latter allows one to crack the RNG (calculate the current state from x,y,z).
Although it might sound easy to check if the period of a given RNG is maximal, in practise it still isn't when we want to use a huge number of bits. I wrote a program that does 4 Gigabit operations per second, it was therefore able to do 3 times 127 matrix multiplications of 127x127 matrices per second. That means that it is reasonably fast to look for working feedback points. Let's say, one minute was enough. But if then we try to use this program to find feedback points for a shift register of 1279 bits, it suddenly takes (1279/127)^{4} = 10286 minutes = 1 week! And 1279 bits is still not really big... Moreover, storing large matrices uses a lot of memory. A matrix of 19937 bits (2^{19937} - 1 is the period of the Mersenne Twister) would use 47.4 Mb per matrix already.
Fortunately, the matrices that we need to prove the period of are not arbitrary. As can be deduced from the paragraphs above, M0 can be written as
0 1 0 0 0 0 0 ... 0 0 0 0 1 0 0 0 0 ... 0 0 0 0 0 1 0 0 0 ... 0 0 . M0 = . . 0 0 0 0 0 0 0 ... 1 0 0 0 0 0 0 0 0 ... 0 1 1 0 0 ... 1 ... 1 ...
Where the 1's on the bottom row are at the feedback point places.
As a result, we can write M0 (for a pxp matrix) as a vector of rows as follows
(0 1 0 0 0 0 0 ... 0) (0 1 0 0 0 0 0 ... 0) M0^{1} (0 1 0 0 0 0 0 ... 0) M0^{2} (0 1 0 0 0 0 0 ... 0) M0^{3} M0 = . . . (0 1 0 0 0 0 0 ... 0) M0^{p-1}
If we multiply this an arbitrary number of times with M0 (say k times), then we get:
^{ } (0 1 0 0 0 0 0 ... 0) M0^{k } (0 1 0 0 0 0 0 ... 0) M0^{k} ^{ } (0 1 0 0 0 0 0 ... 0) M0^{1} M0^{k} (0 1 0 0 0 0 0 ... 0) M0^{k} M0^{1} ^{ } (0 1 0 0 0 0 0 ... 0) M0^{2} M0^{k} (0 1 0 0 0 0 0 ... 0) M0^{k} M0^{2} ^{ } (0 1 0 0 0 0 0 ... 0) M0^{3} M0^{k} (0 1 0 0 0 0 0 ... 0) M0^{k} M0^{3} M0 M0^{k} = . = . ^{ } . . ^{ } . . ^{ } (0 1 0 0 0 0 0 ... 0) M0^{p-1} M0^{k} (0 1 0 0 0 0 0 ... 0) M0^{k} M0^{p-1}
which proves that every matrix that is a power of M0 can be expressed in terms of the top row of the matrix and M0 as follows:
Row(N+1) = Row(N) M0
Example
Consider a RNG of 7 bits with a single feedback points at bit 7 - 3 = 4. The first 7 powers of M0 for that RNG are
^{ } 1 0 0 0 0 0 0 ^{ } 0 1 0 0 0 0 0 ^{ } 0 0 1 0 0 0 0 M0^{0} = 0 0 0 1 0 0 0 ^{ } 0 0 0 0 1 0 0 ^{ } 0 0 0 0 0 1 0 ^{ } 0 0 0 0 0 0 1
^{ } 0 1 0 0 0 0 0 ^{ } 0 0 1 0 0 0 0 ^{ } 0 0 0 1 0 0 0 ^{ } 0 0 1 0 0 0 0 ^{ } 0 0 0 1 0 0 0 ^{ } 0 0 0 0 1 0 0 ^{ } 0 0 0 1 0 0 0 ^{ } 0 0 0 0 1 0 0 ^{ } 0 0 0 0 0 1 0 M0^{1} = 0 0 0 0 1 0 0 , M0^{2} = 0 0 0 0 0 1 0 , M0^{3} = 0 0 0 0 0 0 1 ^{ } 0 0 0 0 0 1 0 ^{ } 0 0 0 0 0 0 1 ^{ } 1 0 0 0 1 0 0 ^{ } 0 0 0 0 0 0 1 ^{ } 1 0 0 0 1 0 0 ^{ } 0 1 0 0 0 1 0 ^{ } 1 0 0 0 1 0 0 ^{ } 0 1 0 0 0 1 0 ^{ } 0 0 1 0 0 0 1
^{ } 0 0 0 0 1 0 0 ^{ } 0 0 0 0 0 1 0 ^{ } 0 0 0 0 0 0 1 ^{ } 0 0 0 0 0 1 0 ^{ } 0 0 0 0 0 0 1 ^{ } 1 0 0 0 1 0 0 ^{ } 0 0 0 0 0 0 1 ^{ } 1 0 0 0 1 0 0 ^{ } 0 1 0 0 0 1 0 M0^{4} = 1 0 0 0 1 0 0 , M0^{5} = 0 1 0 0 0 1 0 , M0^{6} = 0 0 1 0 0 0 1 ^{ } 0 1 0 0 0 1 0 ^{ } 0 0 1 0 0 0 1 ^{ } 1 0 0 1 1 0 0 ^{ } 0 0 1 0 0 0 1 ^{ } 1 0 0 1 1 0 0 ^{ } 0 1 0 0 1 1 0 ^{ } 1 0 0 1 1 0 0 ^{ } 0 1 0 0 1 1 0 ^{ } 0 0 1 0 0 1 1
And indeed, each second row of a sequential power of M0 is equivalent to a next row of M0 itself.
Now let's consider an arbitrary power k of M0 (in the example below k = 48).
^{ } 0 0 1 1 0 1 0 ^{ } 0 0 0 1 1 0 1 ^{ } 1 0 0 0 0 1 0 M0^{k} = 0 1 0 0 0 0 1 ^{ } 1 0 1 0 1 0 0 ^{ } 0 1 0 1 0 1 0 ^{ } 0 0 1 0 1 0 1
and let's call the first row C_{k}^{T}
_{ } 0 _{ } 0 _{ } 1 C_{k} = 1 _{ } 0 _{ } 1 _{ } 0
then
M0^{k} = C_{k}^{T} P
where P
is the vector of matrices
M0^{0} M0^{1} M0^{2} P = M0^{3} M0^{4} M0^{5} M0^{6}
Let I_{n}
be the n-th column of I
and P_{n} = P I_{n}
, a vector of the n-th column of each of the matrices of P
. Then the n-th column of M0^{k} is
M0^{k} I_{n} = C_{k}^{T} P I_{n} = C_{k}^{T} P_{n}
or
I_{n}^{T} (M0^{k})^{T} = P_{n}^{T} C_{k}
Recall that C_{k}^{T}
is the top row of M0^{k}
. That means we can write
C_{k} = (I_{0}^{T} M0^{k})^{T} = (M0^{k})^{T} I_{0}
and can deduce that
C_{m+k} = (M0^{m+k})^{T} I_{0} = (M0^{m} M0^{k})^{T} I_{0} = (M0^{k})^{T} (M0^{m})^{T} I_{0} = (M0^{k})^{T} (I_{0}^{T} M0^{m})^{T} = (M0^{k})^{T} C_{m} = [sum over n] (I_{n}^{T} (M0^{k})^{T} C_{m} I_{n}) = [sum over n] (P_{n}^{T} C_{k} C_{m} I_{n}) = [sum over n] (I_{n}^{T} P^{T} C_{k} C_{m} I_{n}) = P^{T} C_{k} C_{m}
from which we conclude that bit n of C_{k+m}
is
I_{n}^{T} C_{k+m} = I_{n}^{T} P^{T} C_{k} C_{m} = P_{n}^{T} C_{k} C_{m}
Note that P_{n}^{T}
is a row of rows, hence P_{n}^{T} C_{k}
is a row which can also be written as C_{k}^{T} Q
. The n-th bit of C_{k+m}
then becomes
I_{n}^{T} C_{k+m} = C_{k}^{T} Q_{n} C_{m}
where Q_{n}
is a matrix consisting of the n-th columns of the elements of P
, in other words
Q_{n} = Q I_{n}
where
Q = (M0^{0} M0^{1} M0^{2} ... M0^{p-1})
Note that Q_{n} is symmetric so that Q_{n}^{T} = Q_{n}
and therefore C_{k}^{T} Q_{n} C_{m} = C_{m}^{T} Q_{n} C_{k}
, as it should be since M0^{k} M0^{m} = M0^{m} M0^{k}
.
Using this method it was possible to write a program that checks the period of the RNG in a time of the order O(p^{2} * f^{3}), for pxp matrices with f feedback points, instead of the order O(p^{4}). Note that the following observations have been used as well; the period of a RNG with feedback points f1, f2, f3 ... is equal to the period of a RNG with feedback points p - f1, p - f2, p - f3, ... This allowed us to drastically reduce the order of the number of feedback points. Finally note the observation that a RNG only has a period of 2^{p} - 1 solution when using an odd number of feedback points.
Considering that it took me something of the order of an hour to find working feedback points for a RNG with 521 bits, it would now take about 9 weeks to find feedback points for a RNG of 19937 bits instead of 262 year as would be the case when using ordinairy matrices.
Implementation
The implemented Random Number Generator of libecc uses 521 bits and 9 feedback points at 2, 3, 7, 13, 31, 61, 131, 151 and 251. These feedback points are chosen to be primes in order to garantee the least interference. The distance between the feedback points is every time increased by a factor of two (except for the feedback point at 151 which was added in order to make the RNG have its maximum period). The reason for this is again to have the least interference between feedback frequencies, this way the different feedback points nicely supplement each other in achieving bit mangling over the full range.
typedef bitset<S_pool_size> libecc::rng::pool_type |
The type of the constructor argument.
libecc::rng::rng | ( | pool_type const & | seed | ) |
Constructor.
seed | A bitset of 521 bits, any value is allowed except all zeroes. |
void libecc::rng::add_entropy | ( | uint32_t const * | noise, | |
unsigned int | number_of_ints | |||
) |
Add entropy to the random number generator pool.
By adding entropy to the random number generator, the output becomes more unpredictable. Its not really necessary to do this however, its hard enough to 'guess' the 521 bits of seed. It is advisable to use SHA-1 on the output though, in order to make it harder to reverse engineer the internal state of the rng from its output.
void libecc::rng::add_entropy | ( | bitset< n > const | noise | ) | [inline] |
Add entropy from a bitset.
void libecc::rng::generate_512_bits | ( | void | ) |
Generate 512 bits.
Fills the internal output buffer with 512 new random bits. You can retrieve this output with the member function rng::get_512_bits().
bitset<512> const& libecc::rng::get_512_bits | ( | void | ) | const [inline] |
Retrieve a reference to the 512 bits that were generated with a call to rng::generate_512_bits().
unsigned int const libecc::rng::S_pool_size [static] |
Pool size in bits. This RNG uses a pool of 521 bits.