[PATCH] D131959: [AMDGPU] Fix SDST operand of V_DIV_SCALE to always be VCC
Pierre van Houtryve via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 2 01:44:53 PDT 2022
Pierre-vh added a comment.
In D131959#3764069 <https://reviews.llvm.org/D131959#3764069>, @foad wrote:
>> I quickly checked and `check-llvm-codegen-amdgpu` passes if I skip lowering Copies to i1 for VCC/VCC_LO. Should we just do that?
>
> My gut feeling is that that's not safe because VCC/VCC_LO should be treated the same as any other SGPRs. But really I'm out of my depth here and I don't know what the correct solution is.
I guess it depends on what the pass is trying to achieve. As I currently understand it, VCC is already a lane mask, and it's uniform across the wave since it's an SGPR under the hood, so it shouldn't need special treatment to be passed around, no?
The pass uses `V_CMP_NE_U32` to do the lowering/copy, which does:
D.u64[threadId] = (S0 <> S1).
So for `v_cmp_ne_u64_e64 vcc, vcc, 0`, this would just set the mask to all ones or all zeroes depending on whether VCC is all zeroes, no? That doesn't make sense as a copy, because it doesn't preserve the value of VCC but replaces it with a different one.
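To make the concern concrete, here is a minimal Python sketch of the compare semantics quoted above. It is not LLVM code and only models my reading of `D.u64[threadId] = (S0 <> S1)`. The names `v_cmp_ne_u32`, `broadcast`, `WAVE_SIZE` and `FULL_EXEC` are hypothetical, and it assumes wave64, all lanes active, and that a scalar source presents the same value to every lane.

```python
WAVE_SIZE = 64                     # assumption: wave64
FULL_EXEC = (1 << WAVE_SIZE) - 1   # assumption: every lane is active

def v_cmp_ne_u32(s0, s1, exec_mask=FULL_EXEC):
    """Model D.u64[threadId] = (S0 <> S1); s0 and s1 are per-lane lists."""
    mask = 0
    for lane in range(WAVE_SIZE):
        active = (exec_mask >> lane) & 1
        if active and s0[lane] != s1[lane]:
            mask |= 1 << lane
    return mask

def broadcast(scalar):
    """Assumption: a scalar (SGPR-like) operand is the same value in every lane."""
    return [scalar] * WAVE_SIZE

zero = broadcast(0)

# Per-lane boolean (VGPR-like): lanes 1 and 3 hold 1, the rest hold 0.
per_lane = [1 if lane in (1, 3) else 0 for lane in range(WAVE_SIZE)]
mask_from_vgpr = v_cmp_ne_u32(per_lane, zero)

# Scalar source that is already a lane mask with lanes 1 and 3 set.
vcc = 0b1010
mask_from_vcc = v_cmp_ne_u32(broadcast(vcc), zero)

# Scalar source that is all zeroes.
mask_from_zero_vcc = v_cmp_ne_u32(broadcast(0), zero)
```

Under these assumptions the per-lane input produces the mask `0b1010`, while the scalar input `0b1010` produces an all-ones mask rather than `0b1010`, which is the "changes it instead of copying it" behaviour described above.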
I'm also out of my depth here and still trying to piece this together, but my current understanding is that it doesn't make sense to do this lowering for copies of physical SGPRs to i1.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D131959/new/
https://reviews.llvm.org/D131959