[PATCH] D94746: [AMDGPU] Move kill lowering to WQM pass and add live mask tracking
Carl Ritson via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jan 15 18:22:54 PST 2021
critson added inline comments.
================
Comment at: llvm/test/CodeGen/AMDGPU/llvm.amdgcn.kill.ll:87
+; GCN: v_cmp_lg_f32
define amdgpu_gs void @oeq(float %a) {
%c1 = fcmp oeq float %a, 0.0
----------------
piotr wrote:
> The generated code for this test (and a few others) is slightly unexpected (all three patches combined):
>
> Before:
> v_cmpx_lt_f32_e32 vcc, 0, v0
>
> After:
> v_cmp_gt_f32_e32 vcc, 0, v0
> s_andn2_b64 exec, exec, vcc
> s_andn2_b64 exec, exec, vcc
What is happening is that the live mask update and the exec update use the same register, and the shader is marked GS.
Post WQM:
```
// live mask generated:
%3:sreg_64 = COPY $exec
// kill:
%0:vgpr_32 = COPY $vgpr0
V_CMP_GT_F32_e32 0, %0:vgpr_32, implicit-def $vcc, implicit $mode, implicit $exec
// live mask update:
dead %3:sreg_64 = S_ANDN2_B64 %3:sreg_64, $vcc, implicit-def $scc
SI_EARLY_TERMINATE_SCC0 implicit $exec, implicit $scc
// kill implemented:
$exec = S_ANDN2_B64 $exec, $vcc, implicit-def $scc
```
Here the SI_EARLY_TERMINATE_SCC0 generates nothing because the test shader is marked amdgpu_gs. I think the test shaders are too trivial to be representative of real code generation. I added the sendmsg because otherwise these shaders optimise away to nothing. It could still be reasonable to add a peephole to clean these up.
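As a rough illustration of why the repeated s_andn2_b64 is harmless and why a peephole could drop it, here is a small Python model of the lane-mask arithmetic. It is only a sketch: `s_andn2_b64` and `drop_repeated_andn2` are hypothetical helpers written for this example, not LLVM APIs, and the tuple encoding of instructions is an assumption made for illustration.

```python
MASK64 = (1 << 64) - 1  # 64-bit lane mask, one bit per lane


def s_andn2_b64(src0, src1):
    """Model of S_ANDN2_B64: result = src0 & ~src1, SCC = (result != 0)."""
    result = src0 & ~src1 & MASK64
    return result, int(result != 0)


def drop_repeated_andn2(insts):
    """Hypothetical peephole over (opcode, dst, src0, src1) tuples.

    Drops an in-place s_andn2_b64 (dst == src0) that exactly repeats the
    immediately preceding instruction. The repeat recomputes the same
    result and the same SCC, so removing it does not change behaviour.
    """
    out = []
    for inst in insts:
        opcode, dst, src0, _src1 = inst
        if out and inst == out[-1] and opcode == "s_andn2_b64" and dst == src0:
            continue  # identical back-to-back in-place update: redundant
        out.append(inst)
    return out


# Applying the same kill mask twice gives the same exec and the same SCC.
exec_mask, vcc = 0b1111, 0b0101
once = s_andn2_b64(exec_mask, vcc)
twice = s_andn2_b64(once[0], vcc)
assert once == twice

# The "After" sequence from the quoted output, in the toy encoding.
after = [
    ("v_cmp_gt_f32_e32", "vcc", "0", "v0"),
    ("s_andn2_b64", "exec", "exec", "vcc"),
    ("s_andn2_b64", "exec", "exec", "vcc"),
]
cleaned = drop_repeated_andn2(after)
```

This sketch only handles back-to-back duplicates. A real peephole would also need to prove that nothing in between redefines either operand, and in shaders where SI_EARLY_TERMINATE_SCC0 expands to real code the two updates would no longer be adjacent.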
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D94746/new/
https://reviews.llvm.org/D94746