[libcxx-commits] [PATCH] D143816: [libc++] Fixes a flacky test.

Mark de Wever via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Wed Feb 15 10:15:31 PST 2023


Mordante created this revision.
Mordante added a reviewer: ldionne.
Herald added a project: All.
huixie90 added a comment.
Mordante updated this revision to Diff 496696.
Mordante updated this revision to Diff 497717.
Mordante edited the summary of this revision.
Mordante updated this revision to Diff 497726.
Mordante marked an inline comment as done.
Mordante published this revision for review.
Herald added a project: libc++.
Herald added a subscriber: libcxx-commits.
Herald added a reviewer: libc++.

If all the threads exit before the `cv.wait` is called. `cv.wait` would immediately return as the predicate returns `true`.


Mordante added a comment.

CI fixes.


Mordante added a comment.

In D143816#4120396 <https://reviews.llvm.org/D143816#4120396>, @huixie90 wrote:

> If all the threads exit before the `cv.wait` is called. `cv.wait` would immediately return as the predicate returns `true`.

Good point, I had looked at the overload without a predicate.

Still I think this test is better, we want to make sure `notify_all_at_thread_exit` does a notification. Depending on the OS scheduling the previous test may pass without any notification. Which would happen when the predicate is true at its first evaluation.


huixie90 added a comment.

In D143816#4120464 <https://reviews.llvm.org/D143816#4120464>, @Mordante wrote:

> In D143816#4120396 <https://reviews.llvm.org/D143816#4120396>, @huixie90 wrote:
>
>> If all the threads exit before the `cv.wait` is called. `cv.wait` would immediately return as the predicate returns `true`.
>
> Good point, I had looked at the overload without a predicate.
>
> Still I think this test is better, we want to make sure `notify_all_at_thread_exit` does a notification. Depending on the OS scheduling the previous test may pass without any notification. Which would happen when the predicate is true at its first evaluation.

The fact that the test can hang might indicate there might be a problem either in our libc++ , or the OS, if we cannot prove that the test is not correct.

Maybe we should try to figure that out as well


Mordante added a comment.

In D143816#4120523 <https://reviews.llvm.org/D143816#4120523>, @huixie90 wrote:

> In D143816#4120464 <https://reviews.llvm.org/D143816#4120464>, @Mordante wrote:
>
>> In D143816#4120396 <https://reviews.llvm.org/D143816#4120396>, @huixie90 wrote:
>>
>>> If all the threads exit before the `cv.wait` is called. `cv.wait` would immediately return as the predicate returns `true`.
>>
>> Good point, I had looked at the overload without a predicate.
>>
>> Still I think this test is better, we want to make sure `notify_all_at_thread_exit` does a notification. Depending on the OS scheduling the previous test may pass without any notification. Which would happen when the predicate is true at its first evaluation.
>
> The fact that the test can hang might indicate there might be a problem either in our libc++ , or the OS, if we cannot prove that the test is not correct.
>
> Maybe we should try to figure that out as well

I fully agree. That's why the review is still in draft state. But I wanted to test whether this fix passes the CI for all platforms. I'm still looking into it.


Mordante added a comment.

Rewritten the test based on a private discussion with @ldionne.


Mordante added a comment.

Minor comment improvement.



================
Comment at: libcxx/test/std/thread/thread.condition/notify_all_at_thread_exit_lwg3343.pass.cpp:18
+// UNSUPPORTED: use_system_cxx_lib && target={{.+}}-apple-macosx10.{{.+}}
+// UNSUPPORTED: use_system_cxx_lib && target={{.+}}-apple-macosx11.{{.+}}
+
----------------
@ldionne please verify whether these targets are correct. These are the same values I use for the format backdeployment tests.


================
Comment at: libcxx/test/std/thread/thread.condition/notify_all_at_thread_exit_lwg3343.pass.cpp:12
 // notify_all_at_thread_exit(...) requires move semantics to transfer the unique_lock.
 // UNSUPPORTED: c++03
 
----------------
I think this needs to be marked as UNSUPPORTED on the back-deployment target. The fix is in the dylib so we should expect it to fail on back-deployment targets.

I can't believe I didn't think of that earlier, I'm sorry I hadn't realized the original fix was in `thread.cpp`, not in `<thread>`.


While investigating the flaky tests on the mac backdeployment targets
it seems the tests are different than suggested in the LWG issue.
That tests "does work", grabs the mutex, marks the task as done, and
finally calls the notifier.
Our test emulated "does work" after the notification, effectively
sleeping with a lock held. This has been fixed.

A second improvement is that the test fails when, due to OS
scheduling, the condition variable is not used in the main thread.

During discussing the approach of the patch with @ldionne, he
noticed the real reason why the test is flaky; the Apple
backdeployment targets use a dylib, which does not contain the
fix. So the test can't be tested on that platform; it only
proves the LWG issue was correct and the Standard contained
a bug.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D143816

Files:
  libcxx/test/std/thread/thread.condition/notify_all_at_thread_exit_lwg3343.pass.cpp


Index: libcxx/test/std/thread/thread.condition/notify_all_at_thread_exit_lwg3343.pass.cpp
===================================================================
--- libcxx/test/std/thread/thread.condition/notify_all_at_thread_exit_lwg3343.pass.cpp
+++ libcxx/test/std/thread/thread.condition/notify_all_at_thread_exit_lwg3343.pass.cpp
@@ -8,11 +8,15 @@
 //
 // UNSUPPORTED: no-threads
 
-// ALLOW_RETRIES: 3
-
 // notify_all_at_thread_exit(...) requires move semantics to transfer the unique_lock.
 // UNSUPPORTED: c++03
 
+// The fix of LWG3343 is done in the dylib. That means Apple backdeployment
+// targets remain broken. Due to the nature of the test, testing on a broken
+// system does not guarantee that the test fails, so the test can't use XFAIL.
+// UNSUPPORTED: use_system_cxx_lib && target={{.+}}-apple-macosx10.{{.+}}
+// UNSUPPORTED: use_system_cxx_lib && target={{.+}}-apple-macosx11.{{.+}}
+
 // This is a regression test for LWG3343.
 //
 // <condition_variable>
@@ -29,6 +33,8 @@
 #include <mutex>
 #include <thread>
 
+int condition_variable_lock_skipped_counter = 0;
+
 union X {
     X() : cv_() {}
     ~X() {}
@@ -46,11 +52,13 @@
 
     for (int i = 0; i < N; ++i) {
         std::thread t = support::make_test_thread([&] {
-            // Signal thread completion
+            // Emulate work being done.
+            std::this_thread::sleep_for(std::chrono::milliseconds(1));
+
+            // Signal thread completion.
             std::unique_lock<std::mutex> lk(m);
             --threads_active;
             std::notify_all_at_thread_exit(x.cv_, std::move(lk));
-            std::this_thread::sleep_for(std::chrono::milliseconds(1));
         });
         t.detach();
     }
@@ -62,6 +70,15 @@
     // unlocks `m`.
     {
         std::unique_lock<std::mutex> lk(m);
+        // Due to OS scheduling the workers might have terminated when this
+        // code is reached. In that case the wait will not sleep and the call
+        // to notify_all_at_thread_exit has no effect; the condition variable
+        // will not be used here.
+        //
+        // Keep track of how often that happens, when too often the test needs
+        // to be improved.
+        if(threads_active == 0)
+            ++condition_variable_lock_skipped_counter;
         x.cv_.wait(lk, [&]() { return threads_active == 0; });
     }
 
@@ -92,5 +109,9 @@
         test();
     }
 
+    // The threshold is arbitrary, it just makes sure the notification is
+    // tested a reasonable number of times.
+    assert(condition_variable_lock_skipped_counter < 250);
+
     return 0;
 }


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D143816.497726.patch
Type: text/x-patch
Size: 2593 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/libcxx-commits/attachments/20230215/6eabce49/attachment.bin>


More information about the libcxx-commits mailing list