[libcxx-commits] [PATCH] D99175: [libcxx] [test] Make the condvar wait_for tests less brittle

Tue Mar 23 07:09:56 PDT 2021

mstorsjo added inline comments.

================
Comment at: libcxx/test/std/thread/thread.condition/thread.condition.condvar/wait_for.pass.cpp:50
+           cv.wait_for(lk, wait_end - Clock::now()) == std::cv_status::no_timeout)
         ;
     Clock::time_point t1 = Clock::now();
----------------
Quuxplusone wrote:
> What happens with the subtraction when `wait_end < Clock::now()`?
> (This could happen because of a libc++ bug; or more likely it could happen because `wait_for` woke spuriously after 249.9ms and then it took us 0.2ms to get around to calling `Clock::now()` again.)
> 
> But also, there's no way this code should ever be intentionally waiting 250ms even once. The main thread is just waiting for line 44's `test1=1;` to happen, and then it //immediately// sets `test2=1` and notifies (line 76).
> 
> I think the only way that our line 48/49 can ever time out if if it entirely //misses// the main thread's notify due to not being asleep yet. Which actually probably happens a lot. But if the //point// of this test is to time out, then the main thread really shouldn't be notifying //at all ever// (and the timeout should probably be shorter, say 50ms). And if that's //not// the point of this test, then the main thread should be notifying only //after// `f` has gone to sleep, so the notify doesn't get missed. I think the straightforward way to do that is to swap lines 75+76 so that the main thread notifies under the lock. (That way, there's no semantic race condition where `f` spuriously wakes, then `main` notifies (without holding the lock) which races with `f`'s going back to sleep (under the lock).
Ok, regarding the subtraction, you're possibly right, but for the rest of the test, you're missing what's happening. Please back off and re-read the test. I would say it's mostly correct as is, except for the repeated timeouts.

Trying to respond to some of your points:

- You claim "Then the main thread immediately sets `test2=1` and notifies" No, it doesn't always. The main thread does two tests. First the main thread waits for the thread to start up (line 72), sets test2 (74) and notifies (76). Then it repeats the whole experiment, without setting test2 and without notifying (lines 81-90). This is the part of the test that actually exercises the timeouts.

- You claim "I think the only way that our line 48/49 can ever time out if if it entirely misses the main thread's notify due to not being asleep yet." No, it can't miss that, because the main thread first locks the mutex (68), then enters a wait loop (71-72). At this point the mutex is released, and the thread can proceed. Once the thread has proceeded past lines 44-45 (setting the test1 flag and notifying) and releasing the mutex while waiting in line 49, the main thread is unblocked and can proceed to set the flag. So there's no way that the thread can miss the notify in the way you describe.

- You claim "then the main thread should be notifying only after f has gone to sleep", well it does! The main thread and f-thread are mutually exclusive, and main waits at line 72 until the thread has reached its waiting point.

Now back to the topic of the test: How would you suggest to change the wait duration?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99175/new/

https://reviews.llvm.org/D99175