[PATCH] D131959: [AMDGPU] Fix SDST operand of V_DIV_SCALE to always be VCC

Pierre van Houtryve via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 1 07:25:18 PDT 2022


Pierre-vh added a comment.

In D131959#3763962 <https://reviews.llvm.org/D131959#3763962>, @foad wrote:

> Well, I don't fully understand the condition on line 712:
>
>   if (!SrcReg.isVirtual() || (!isLaneMaskReg(SrcReg) && !isVreg1(SrcReg))) 
>
> I'm not sure what kind of copies from physical SGPRs this pass is (was) expecting to see here. I suppose an instruction like this could have two possible interpretations:
>
>   %0:vreg_1 = COPY $sgpr0
>
> 1. Each bit of sgpr0 supplies an i1 value for the corresponding lane of the result.
> 2. The whole of sgpr0 is either 0 or 1, and that value should be broadcast to every lane of the result.
>
> Stepping back a bit, it feels like you are running into problems because you have a def and a use of vcc, where one of them is implicit and the other is explicit. If def and use were both implicit or both explicit then I think things might work better. Is there a way you can change v_div_scale so that the vcc operand is explicit?

I initially wanted to do just that, but IIRC @arsenm told me that it's not a good idea to force a output operand to be a specific register by constraining it with a RegClass, instead, implicit definitions should be used. I also remember starting out this patch just like that but I ran into a lot more issues, especially with the register allocator.

I quickly checked and `check-llvm-codegen-amdgpu` passes if I skip lowering Copies to i1 for VCC/VCC_LO. Should we just do that? Not sure if it makes much sense to lower such copies.
Then the pass would be very minimal, just fixing up the copy, but I think it might even be possible to remove it entirely in that case.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131959/new/

https://reviews.llvm.org/D131959



More information about the llvm-commits mailing list