[PATCH] D131959: [AMDGPU] Fix SDST operand of V_DIV_SCALE to always be VCC
Pierre van Houtryve via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Sep 1 07:25:18 PDT 2022
Pierre-vh added a comment.
In D131959#3763962 <https://reviews.llvm.org/D131959#3763962>, @foad wrote:
> Well, I don't fully understand the condition on line 712:
>
> if (!SrcReg.isVirtual() || (!isLaneMaskReg(SrcReg) && !isVreg1(SrcReg)))
>
> I'm not sure what kind of copies from physical SGPRs this pass is (was) expecting to see here. I suppose an instruction like this could have two possible interpretations:
>
> %0:vreg_1 = COPY $sgpr0
>
> 1. Each bit of sgpr0 supplies an i1 value for the corresponding lane of the result.
> 2. The whole of sgpr0 is either 0 or 1, and that value should be broadcast to every lane of the result.
>
> Stepping back a bit, it feels like you are running into problems because you have a def and a use of vcc, where one of them is implicit and the other is explicit. If def and use were both implicit or both explicit then I think things might work better. Is there a way you can change v_div_scale so that the vcc operand is explicit?
I initially wanted to do just that, but IIRC @arsenm told me that it's not a good idea to force a output operand to be a specific register by constraining it with a RegClass, instead, implicit definitions should be used. I also remember starting out this patch just like that but I ran into a lot more issues, especially with the register allocator.
I quickly checked and `check-llvm-codegen-amdgpu` passes if I skip lowering Copies to i1 for VCC/VCC_LO. Should we just do that? Not sure if it makes much sense to lower such copies.
Then the pass would be very minimal, just fixing up the copy, but I think it might even be possible to remove it entirely in that case.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D131959/new/
https://reviews.llvm.org/D131959
More information about the llvm-commits
mailing list