[LLVMbugs] [Bug 23168] New: uniform_real_distribution isn't uniform.

bugzilla-daemon at llvm.org bugzilla-daemon at llvm.org
Thu Apr 9 01:53:14 PDT 2015


https://llvm.org/bugs/show_bug.cgi?id=23168

            Bug ID: 23168
           Summary: uniform_real_distribution isn't uniform.
           Product: libc++
           Version: unspecified
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: All Bugs
          Assignee: unassignedclangbugs at nondot.org
          Reporter: art at blahonga.org
                CC: llvmbugs at cs.uiuc.edu, mclow.lists at gmail.com
    Classification: Unclassified

Looking at the implementation of uniform_real_distribution it's quite obvious
that it isn't respecting the pigeonhole principle. And since lots of other
distributions are built on top of uniform_real_distribution, all those other
distributions are broken too.

Here's a very simple test:

$ cat urd.cxx
#include <random>
#include <iostream>

#include <stdio.h>
#include <assert.h>

int
main()
{
        std::default_random_engine gen;
        double from = 0x1p52, to = from + 2;
        std::uniform_real_distribution<double> dis(from, to);
        long long bucket[3] = { 0 };
        int i;

        for (i = 0; i < 1000000; i++) {
                double r = dis(gen);
                unsigned int b = (int)(r - from);
                if (b >= 3) {
                        printf("foo: %d %f\n", b, r);
                }
                assert(b < 3);
                bucket[b]++;
        }

        for (i = 0; i < 3; i++) {
                printf("%lld\n", bucket[i]);
        }


        return 0;
}
$ c++ -v
Apple LLVM version 6.0 (clang-600.0.57) (based on LLVM 3.5svn)
Target: x86_64-apple-darwin14.1.0
Thread model: posix
$ c++ -std=c++11 -o foo urd.cxx && ./foo
249701
500046
250253


I looked through the code available on the svn trunk on the web and it hasn't
changed enough to fix this.

The reason 0x1p52 is the start of the range is because at [2^52,2^53) doubles
have exactly integer precision which allows the generator only three possible
output values. Any other range could be chosen and display the same problem as
long as we constrict the generator to few (non 2^n) outputs. Three possible
outputs will display the greatest bias in the output.

GCC libstdc++ is broken too in the exact same way.

FYI. A while ago I started an attempt to make a decent solution to a small part
of this problem because it's harder than it looks:
https://github.com/art4711/random-double/blob/master/rd.c

-- 
You are receiving this mail because:
You are on the CC list for the bug.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-bugs/attachments/20150409/f66ccb9d/attachment.html>


More information about the llvm-bugs mailing list