This is the mail archive of the gsl-discuss@sources.redhat.com mailing list for the GSL project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: rand_rate


On Wed, 10 Nov 2004, Linas Vepstas wrote:

> On Tue, Nov 09, 2004 at 11:50:12PM -0500, Robert G. Brown was heard to remark:
> > I mentioned my random number tester program, 
> > 
> >   http://www.phy.duke.edu/~rgb/General/rand_rate.php
> > 
> > outputting a "report" that includes a p-value (or set of p-values) and a
> > fairly arbitrary conclusion about whether or not the rng is likely to be
> > questionable for people who can't interpret p-values, for the test being
> > run.
> > 
> > It is still very much a project in progress -- I'm missing lots (most)
> > of the STS and diehard tests -- but it does provide a framework for
> 
> So what does it test for currently?  (could you update the web page to 
> state this?)  I've been fiddling with something that seems to be 
> like a prng (it seems to  pass basic tests) but its not a traditional
> polynomial curve/cryptographic prng;  I want to understand if/how it 
> falls down. (its rather bizarre) I know how its weak in certain cases; 
> but want to understand if/how it might pass traditional tests.

sts monobit, sts runs, a variant of the former of my own that actually
checks to see if the bit count in strings is binomially distributed
(checks the whole binned histogram) -- it doesn't just look at the mean
but at the entire distribution, so it can in principle catch things with
the right mean monobit frequency but that has different moments.  A
bitpair test that checks the frequency of 00,01,10,11 compared to their
binomial probabilities in cyclic bit strings.  I'm working on the
diehard runs test, which generates a histogram of the number of
sequentially ascending or decending runs of numbers output by the rng,
compared to theoretically expected values.  I'm also planning to extend
the bit distribution test to test for the correct frequency of bit
triplets, quadruplets,...n-tuples as I think that this would replace
several diehard tests with a single rational process.

The idea behind rng tests is very straightforward and is detailed in the
STS white paper that accompanies their test suite.  There are all sorts
of distributions that can be generated from a string of truly random
bits (or integers, or floats).  Pick a such a distribution, any
distribution, with a known method for generating it from random numbers
and known statistical properties e.g. mean and variance.  Generate the
distribution from the pseudorandom number generator, compare the result
to the known distribution and generate a chisq, convert the chisq (for
normal/CLT-related means) into a probability p that a good random number
generator could produce the observed distribution.  

If p is bigger than very small, you can't say much about the underlying
rng.  Could be good, could be bad, but no reason to think it is bad.  If
p is small -- say less than 0.05 or 0.01 -- then there is less than a 5%
or 1% chance that a good distribution would have produced the result and
the rng MIGHT be bad.  If you run the test lots of times with lots of
seeds and get an abnormal number of low values of p (if p itself isn't
correctly and consistently distributed) eventually you conclude that the
rng is bad.

Repeat for lots of exotic distributions seems to be both the diehard
rule and the sts rule; one thing I'm thinking about is whether a lot of
the distributions they check are effectively equivalent to much simpler
hierarchical bit tests (like directly checking the frequency
distribution of n-bit combinations in all embedded substrings of bits).
I'm thinking about this because some of their tests -- e.g. looking for
and counting points where a 000.. bit string connects to a ...11 bit
string or vice versa -- can be shown to be absolutely equivalent to
checking the 2-tuple distribution function, with each 2-tuple appearing
1/4 of the time (within binomial sigma).

There are a few exceptions to this test rule, one of them in place in my
code.  Certain bad rngs are bad because for certain seeds they have bits
that don't change at all, for example.  I have a sliding mask test that
masks out bits that don't vary for certain seeds for a given rng,
something that kills the rng right there.  randu, for example, fails
this miserably (as it fails so many tests -- it is good to have as a bad
generator:-).

Alas I don't have anywhere near all of the tests in the tool yet -- it
isn't difficult to add a test as you can see above, and tests mostly
share a similar structure involving computation of a quantity or a
vector of quantities and either an erfc or incomplete gamma to convert
chisq into a p-value.  The only tricky part is being able to evaluate
the expected value and (approximately normal) sigma for each quantity
you compute so you can tot up chisq and p mechanically later.  Still,
I'm teaching and doing other research, so I have only limited time to
spend on this.  If anybody LIKES this tool and wants to contribute a
routine for a diehard or sts or homebrew test, I'll cheerily give it a
number and embed it into the tool, with attribution.  The only rule I
have is that the submissions have to be full GPL, since I'm linking them
to the GSL and since my own code is full (viral) GPL.  That means
rewriting e.g. Diehard or STS from their descriptions because the
released code isn't GPL, as far as I can tell, although neither is it
restricted.  The diehard source is all f2c generated from evil fortran,
anyway, so rewriting it sanely is essential anyway.

One other class of test I plan to eventually add are spectral tests and
searches for hyperplanes in N dimensions -- there is nice code and
algorithm description out there to do this and the idea itself is very
straightforward -- certain classes of rng generate points that lie on
hyperplanes in N dimensions, and you can categorize them by the
dimension N where the hyperplanes first appear and they "geometrically"
fail.  Not there yet.

So this is very much a tool in progress, and may have bugs in it (some
tests have constraints like not specifying size and bits at the same
time and I've been slow in running all those cases down and preventing
them).  It is still kind of fun to watch weak generators fail and to
stress the strong ones to see if they'll fail.

   rgb

> 
> --linas
> 

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb@phy.duke.edu



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]