[PATCH] D131959: [AMDGPU] Fix SDST operand of V_DIV_SCALE to always be VCC

Fri Sep 2 01:44:53 PDT 2022

Pierre-vh added a comment.

In D131959#3764069 <https://reviews.llvm.org/D131959#3764069>, @foad wrote:

>> I quickly checked and `check-llvm-codegen-amdgpu` passes if I skip lowering Copies to i1 for VCC/VCC_LO. Should we just do that?
>
> My gut feeling is that that's not safe because VCC/VCC_LO should be treated the same as any other SGPRs. But really I'm out of my depth here and I don't know what the correct solution is.

I guess it depends on what the pass is trying to achieve. As I currently understand it, VCC is already a lane mask, and it's uniform across the wave since it's an SGPR under the hood, so it shouldn't need special treatment to be passed around, no?

The pass uses `V_CMP_NE_U32` to do the lowering/copy, which does:

  D.u64[threadId] = (S0 <> S1).

So for `v_cmp_ne_u32_e64 vcc, vcc, 0`, every lane compares the same scalar VCC value against 0, so this would just set the mask to all ones if VCC is nonzero and to all zeroes otherwise, no? Then it doesn't make sense, because it doesn't COPY VCC but changes it instead.
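
To make that concern concrete, here is a rough Python model of the per-lane semantics quoted above. It is only a sketch under several assumptions: the function name `v_cmp_ne_scalar` is hypothetical, both operands are treated as plain scalars (operand-width truncation is ignored), EXEC defaults to all lanes active, and inactive lanes are assumed to get 0 in the result.

```python
def v_cmp_ne_scalar(s0, s1, wave_size=64, exec_mask=None):
    """Simplified model of D[lane] = (S0 <> S1) with scalar operands.

    Hypothetical helper for illustration only, not the hardware definition.
    """
    if exec_mask is None:
        # Assumption: every lane in the wave is active.
        exec_mask = (1 << wave_size) - 1
    result = 0
    for lane in range(wave_size):
        lane_active = (exec_mask >> lane) & 1
        # Every lane sees the same scalar operands, so the comparison
        # result is identical for all active lanes.
        if lane_active and s0 != s1:
            result |= 1 << lane
    return result


vcc = 0b1010  # an arbitrary lane mask with only two lanes set
new_vcc = v_cmp_ne_scalar(vcc, 0)
print(hex(new_vcc), new_vcc == vcc)
```

Under this model, a nonzero VCC input comes back as a mask with every active lane set rather than as the original mask, which is the behaviour questioned above.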

I'm also out of my depth here and still trying to piece this together, but my current understanding is that it doesn't make sense to do this lowering for copies of physical SGPRs to i1.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131959/new/

https://reviews.llvm.org/D131959