<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/76057>76057</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            TestGlobalModuleCache.py test is flaky on 32 bit Arm Linux
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          DavidSpickett
      </td>
    </tr>
</table>

<pre>
This test was added by https://github.com/llvm/llvm-project/pull/74894 and was flaky on Arm and AArch64, though since then I have only seen it fail on Arm, so I assume it's just a lot more common there.

The failure looks like:
```
intern-state     ^^^^^^^^ Thread::ShouldStop Begin ^^^^^^^^
intern-state     Plan stack initial state:
  thread #1: tid = 0x1213e8:
    Active plan stack:
      Element 0: Base thread plan.
      Element 1: Single stepping past breakpoint site 11 at 0xf7fc0c14

python3.10       Discarding thread plans for thread (tid = 0x1213e8, force 1)
intern-state     Plan Step over breakpoint trap should stop: 0.
intern-state     Completed step over breakpoint plan.
intern-state     Plan Step over breakpoint trap auto-continue: true.
intern-state     ^^^^^^^^ Thread::ShouldStop plan stack before PopPlan ^^^^^^^^
intern-state       thread #1: tid = 0x1213e8:
    Active plan stack:
      Element 0: Base thread plan.
    Discarded plan stack:
      Element 0: Single stepping past breakpoint site 11 at 0xf7fc0c14

python3.10: /home/david.spickett/llvm-project/lldb/source/Target/ThreadPlanStack.cpp:151: lldb::ThreadPlanSP lldb_private::ThreadPlanStack::PopPlan(): Assertion `m_plans.size() > 1 && "Can't pop the base thread plan"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
#0 0xf174a0c0 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) Signals.cpp:0:0
#1 0xf1747b24 llvm::sys::RunSignalHandlers() Signals.cpp:0:0
#2 0xf174a968 SignalHandler(int) Signals.cpp:0:0
#3 0xf774d6e0 __default_sa_restorer ./signal/../sysdeps/unix/sysv/linux/arm/sigrestorer.S:67:0
#4 0xf773db06 ./csu/../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:47:0
#5 0xf777d2ca __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
#6 0xf774c840 gsignal ./signal/../sysdeps/posix/raise.c:27:6
```

Note that this build included the logging I added in https://github.com/llvm/llvm-project/commit/d14d52158bc444e2d036067305cf54aeea7c9edb, after narrowing down the crash to one particular call to `PopPlan`.

Initially I could not reproduce it but have now been able to do so, though it takes a long time to fail.

The basic problem is that somehow `Thread::ShouldStop` asks for the current plan when the stack consists of the base plan and the single step plan. Before it can decide that the step has finished and should be popped, a call is made in the test to destroy the debugger.

This follows this call chain:
* `Debugger::Destroy`
* `Debugger::Clear`
* `Process::Finalize`
* `Process::DestroyImpl`
* `ThreadList::DiscardThreadPlans`
* `Thread::DiscardThreadPlans`

This is why we see `Discarding thread plans for thread (tid = 0x1213e8, force 1)` in the log output.

Discarding the thread plans leaves only the base plan on the stack (the stack *always* has the base plan no matter what). So when `Thread::ShouldStop` decides to pop the single step plan, it's no longer on the stack. It's a time of check/time of use issue, except `ShouldStop` wasn't written with the expectation that the plan stack could change while it is running.

This means that `PopPlan` tries to pop the base plan, which asserts to tell us we can't do that.
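
To make the interleaving concrete, here is a toy Python model of it. This is only a sketch: `ToyPlanStack` and `replay_interleaving` are hypothetical names, not LLDB classes, and the "other thread" is replayed sequentially so that the failure is deterministic rather than timing dependent.

```python
class ToyPlanStack:
    """Hypothetical stand-in for a thread plan stack; not LLDB's real class."""

    def __init__(self):
        self.plans = ["base"]  # the base plan is always present

    def push(self, plan):
        self.plans.append(plan)

    def current(self):
        return self.plans[-1]

    def discard_all(self):
        # Same effect as discarding thread plans: only the base plan remains.
        del self.plans[1:]

    def pop(self):
        # Same condition as the assertion that fails in the log above.
        if not len(self.plans) > 1:
            raise AssertionError("Can't pop the base thread plan")
        return self.plans.pop()


def replay_interleaving():
    stack = ToyPlanStack()
    stack.push("step over breakpoint")

    # 1. "ShouldStop" reads the current plan.
    plan = stack.current()

    # 2. Meanwhile the debugger is destroyed, which discards the plans.
    stack.discard_all()

    # 3. "ShouldStop" decides the plan it read earlier is done and pops it.
    try:
        stack.pop()
    except AssertionError as err:
        return plan, str(err)
    return plan, None
```

Calling `replay_interleaving()` returns the plan that was read in step 1 together with the assertion message, because by step 3 only the base plan is left to pop.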

The overall issue seems to be one of destruction order. Or at least, something isn't telling the threads to stop before we start destroying the process. I think the threads are destroyed later in `Process::Finalize`, and I think there's a potential bug there too.
```
  m_thread_plans.Clear();
  m_thread_list_real.Destroy();
  m_thread_list.Destroy();
  m_extended_thread_list.Destroy();
```
The thread plans are cleared before the threads that would be looking at them are destroyed. That's not the cause of this particular assert, but it's suspicious to me at least.
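
As a rough illustration of why that ordering looks suspicious, here is a toy Python model. Everything in it is an assumption made for illustration (`ToyProcess`, `plans_seen_by` and the `clear_plans_first` flag are hypothetical and do not correspond to LLDB's real classes). It only records what a thread that is still alive would find if it looked up its plans during teardown.

```python
class ToyProcess:
    """Hypothetical model; names do not correspond to LLDB's real classes."""

    def __init__(self, tids):
        self.thread_plans = {tid: ["base", "step"] for tid in tids}
        self.threads = list(tids)
        self.observed = []

    def plans_seen_by(self, tid):
        # What a still-alive thread finds when it looks up its plans.
        return self.thread_plans.get(tid)

    def finalize(self, clear_plans_first):
        if clear_plans_first:
            self.thread_plans.clear()
        # The threads are still alive here and may consult their plans.
        for tid in self.threads:
            self.observed.append((tid, self.plans_seen_by(tid)))
        self.threads.clear()
        if not clear_plans_first:
            self.thread_plans.clear()
        return self.observed
```

With `clear_plans_first=True` (the order in the snippet above) each thread finds no plans at all before it is destroyed. With `False` each thread still sees its own plans until it is gone.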

The underlying problem is probably whatever is letting `Thread::ShouldStop` run, even though we're in the process of destroying the `Process` that contains the threads.
</pre>