[compiler-rt] [asan][test] Fix flake in asan_lsan_deadlock test (PR #141145)

Thu May 22 16:17:14 PDT 2025

ilovepi wrote:

@Camsyn I'm not sure I follow. The failures on our bots don't appear to be due to a timeout. I see some timeouts on the public build bot, but not on our CI: https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8714404311534072321/overview (logs: https://logs.chromium.org/logs/fuchsia/buildbucket/cr-buildbucket/8714404311534072321/+/u/clang/test/stdout).

I also wonder about a few things after looking at this test.

 1) Is your reasoning about SUMMARY reporting being a reliable signal correct?
    - I'm not sure it is, given that I have bots that don't seem to time out in this test (at least I don't see that in the log).
 2) How do you ensure lock acquisitions overlap?
    - I don't think there is enough synchronization to guarantee any ordering. Depending on scheduling, I could believe the thread you launch finishes before the main thread tries to enter its critical section. What happens then?
    - You probably need some kind of coordination between the threads, maybe `std::latch` with `std::jthread`?

The thread doing the OOB access probably needs to be parked on a `std::latch` until you're ready to start the critical sections. Otherwise, I don't see what prevents the OOB thread from completing immediately, in which case you won't exercise the deadlock scenario. In fact, I wonder if the test usually *doesn't* exercise the deadlock scenario at all, because the launched thread completes quickly and the summary is then reported as expected.

In any case, I don't think this test is giving a useful signal, so I guess we should disable it until it can be fixed. 

https://github.com/llvm/llvm-project/pull/141145