[PATCH] Implement std::experimental::sample

Eric Fiselier eric at efcs.ca
Thu Apr 30 15:18:40 PDT 2015


I'm fairly confident in the selection sampling algorithm as well. However I'm not quite ready to sign off on it. The algorithm as Knuth describes it generates its random numbers slightly differently.

Knuth lists the following steps for selection sampling:

- Generate a random number `U` uniformly distributed between zero and one.
- if `(unsampled_size * U) < remaining_inserts` then choose that element.

However this implementation of `sample` does:

- Generate a random number `U` uniformly distributed between zero and `unsampled_size - 1`.
- if `U < remaining_inserts` then choose that element.

My concern is that semantic change could introduce a bias. However I'm bad at math and stats so I don't know.
I want to think on it a little more to make sure they are equivalent.

I've also created a patch of various changes and additions related to adding the `<experimental/algorithm>` header and making the tests run in C++03 (this invalidates the above patch.). Please consider applying this to your patch.
https://gist.github.com/EricWF/c62b0979202cfd2cc6af


================
Comment at: include/experimental/algorithm:73
@@ +72,3 @@
+  for (__n = _VSTD::min(__n, __unsampled_sz); __n != 0; ++__first) {
+    _Distance __r =
+        _VSTD::uniform_int_distribution<_Distance>(0, --__unsampled_sz)(__g);
----------------
Same note about using `_Distance`

http://reviews.llvm.org/D9044

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/






More information about the cfe-commits mailing list