[PATCH] D83214: [AMDGPU][GlobalISel] Select llvm.amdgcn.ballot
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jul 10 10:34:41 PDT 2020
arsenm accepted this revision.
arsenm added inline comments.
This revision is now accepted and ready to land.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp:1053-1054
+
+ Optional<ValueAndVReg> Arg =
+ getConstantVRegValWithLookThrough(I.getOperand(2).getReg(), *MRI, true);
+
----------------
mbrkusanin wrote:
> arsenm wrote:
> > I think you want just regular getConstantVRegVal. I don't think you're getting much from the look through
> Unfortunately regular version fails to produce the value.
>
I think something probably went wrong here due to us not trying to do anything resembling optimization during/after RegBankSelect. When we do that, we can probably remove a lot of these
================
Comment at: llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.ballot.i64.ll:11-12
+; CHECK: ; %bb.0:
+; CHECK-NEXT: s_mov_b32 s0, 0
+; CHECK-NEXT: s_mov_b32 s1, 0
+; CHECK-NEXT: ; return to shader part epilog
----------------
mbrkusanin wrote:
> arsenm wrote:
> > This can be one s_mov_b64
> It can, but SIFoldOperands will not let that happen.
>
> From:
> %10:sreg_64 = S_MOV_B64 0
> %3:sreg_32 = COPY %10.sub0:sreg_64
> %4:sreg_32 = COPY %10.sub1:sreg_64
> plus some instructions that use %3, %4 but will eventually be removed.
>
> SIFoldOperands will produce:
> %10:sreg_64 = S_MOV_B64 0
> %3:sreg_32 = S_MOV_B32 0
> %4:sreg_32 = S_MOV_B32 0
> ...
>
> which makes the first instruction dead and in the end we're left with two S_MOV_B32.
>
> For example bellow with exec, AMDGPU::sub0_sub1 seems to do the trick but I don't see anything similar for immediate opreands.
> Alternatively we can produce
> v_cmp_ne_u32_e64 s[0:1], 0, 0
> if for whatever reason that is more preferable then
> s_mov_b32 s0, 0
> s_mov_b32 s1, 0
>
> Anyway, this is not an issue with selecting ballot. Following example has the same issue:
>
> ```
> define amdgpu_cs i64 @si_fold_constants_i64() {
> %x = add i64 0, 0
> ret i64 %x
> }
> ```
I guess this is another bug to solve. Can you file that somewhere? We shouldn't be trying to workaround it in the selector
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D83214/new/
https://reviews.llvm.org/D83214
More information about the llvm-commits
mailing list