[llvm] [AMDGPU] Skip terminators when forcing emit zero flag (PR #112116)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 14 06:56:24 PDT 2024
jayfoad wrote:
> > In the test case, why does SIInsertWaitcnts want to put a wait between the two branch instructions?
>
> Because we force emit zero via `amdgpu-waitcnt-forcezero`. That is a debug switch, but it didn't work properly in some cases.
Oh, I see. That option is weird. I thought it meant "if any waitcnt is required, wait for the counter to be 0". But actually it waits for the counter to be zero after every instruction, even instructions that have nothing to do with wait counts.
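To make the difference between the two readings concrete, here is a rough toy model. It is only a sketch and not the code in SIInsertWaitcnts: the `Instr` class, its `needs_wait` and `is_terminator` fields, the `WAIT_ZERO` placeholder, and both function names are made up for illustration. It also assumes that "skip terminators" means no wait is placed after a terminator, which is a guess based on the PR title.

```python
from dataclasses import dataclass

WAIT_ZERO = "wait(0)"  # placeholder string, not a real AMDGPU mnemonic


@dataclass(frozen=True)
class Instr:
    name: str
    needs_wait: bool = False     # hypothetical: depends on an outstanding counter
    is_terminator: bool = False  # hypothetical: branch at the end of a block


def wait_only_when_required(block):
    """Reading 1: emit a zero wait only before instructions that need a wait."""
    out = []
    for instr in block:
        if instr.needs_wait:
            out.append(WAIT_ZERO)
        out.append(instr.name)
    return out


def wait_after_every_instruction(block, skip_terminators=False):
    """Reading 2: emit a zero wait after every instruction, needed or not."""
    out = []
    for instr in block:
        out.append(instr.name)
        if skip_terminators and instr.is_terminator:
            continue  # assumed fix: no wait after a terminator
        out.append(WAIT_ZERO)
    return out


# A block that ends in two branch instructions, as in the test case discussed.
block = [
    Instr("load"),
    Instr("use_load", needs_wait=True),
    Instr("cond_branch", is_terminator=True),
    Instr("branch", is_terminator=True),
]

print(wait_only_when_required(block))
print(wait_after_every_instruction(block))
print(wait_after_every_instruction(block, skip_terminators=True))
```

In this model, only the second call puts a wait between `cond_branch` and `branch`. The first reading never would, and the third call shows the same block with terminators skipped.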
https://github.com/llvm/llvm-project/pull/112116