[compiler-rt] [asan][test] Fix flake in asan_lsan_deadlock test (PR #141145)
Paul Kirth via llvm-commits
llvm-commits at lists.llvm.org
Thu May 22 16:17:14 PDT 2025
ilovepi wrote:
@Camsyn I'm not sure I follow. The failures in our bots don't appear to be due to a timeout. I see some of those on the public build bot, but not on our CI https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8714404311534072321/overview (logs https://logs.chromium.org/logs/fuchsia/buildbucket/cr-buildbucket/8714404311534072321/+/u/clang/test/stdout).
I also wonder about a few things after looking at this test.
1) Is your reasoning about SUMMARY reporting being a reliable signal correct?
- I'm not sure it is, given that I have bots that don't seem to time out in this test (at least I don't see that in the log).
2) How do you ensure lock acquisitions overlap?
- I don't think there is enough synchronization to guarantee any ordering. Depending on scheduling, the thread you launch could finish before the main thread tries to enter its critical section. What happens then?
- You probably need some kind of coordination between the threads, like maybe `std::latch` w/ `jthread`?
The thread that is doing the OOB access probably needs to be parked via `std::latch` until you're ready to start the critical sections. Otherwise I don't see what prevents the OOB thread from completing immediately, in which case you won't exercise the deadlock scenario. In fact, I wonder if the test usually *doesn't* test the deadlock scenario at all, because the launched thread completes quickly and the summary is then reported as expected.
In any case, I don't think this test is giving a useful signal, so I guess we should disable it until it can be fixed.
https://github.com/llvm/llvm-project/pull/141145