[all-commits] [llvm/llvm-project] 39e1a9: [AMDGPU] SI_INDIRECT_DST_V* pseudos expansion sho...
alex-t via All-commits
all-commits at lists.llvm.org
Tue Mar 10 04:04:38 PDT 2020
Branch: refs/heads/master
Home: https://github.com/llvm/llvm-project
Commit: 39e1a90784b32dbcfcad97dab3c33bb03bfcfcee
https://github.com/llvm/llvm-project/commit/39e1a90784b32dbcfcad97dab3c33bb03bfcfcee
Author: alex-t <alexander.timofeev at amd.com>
Date: 2020-03-10 (Tue, 10 Mar 2020)
Changed paths:
M llvm/lib/Target/AMDGPU/SIISelLowering.cpp
M llvm/test/CodeGen/AMDGPU/indirect-addressing-term.ll
Log Message:
-----------
[AMDGPU] SI_INDIRECT_DST_V* pseudos expansion should place EXEC restore to separate basic block
Summary:
When SI_INDIRECT_DST_V* pseudos has indexes in VGPR, they get expanded into the self-looped basic block that modifies EXEC in a loop.
To keep EXEC consistent it is stored before and then re-stored after the pseudo expansion result.
%95:vreg_512 = SI_INDIRECT_DST_V16 %93:vreg_512(tied-def 0), %94:sreg_32, 0, killed %1500:vgpr_32
results to
s_mov_b64 s[6:7], exec
BB0_16:
v_readfirstlane_b32 s8, v28
v_cmp_eq_u32_e32 vcc, s8, v28
s_and_saveexec_b64 vcc, vcc
s_set_gpr_idx_on s8, gpr_idx(DST)
v_mov_b32_e32 v6, v25
s_set_gpr_idx_off
s_xor_b64 exec, exec, vcc
s_cbranch_execnz BB0_16
; %bb.17:
s_mov_b64 exec, s[6:7]
The bug appeared in case this expansion occurs in the ELSE block of the CF.
Originally
%110:vreg_512 = SI_INDIRECT_DST_V16 %103:vreg_512(tied-def 0), %85:vgpr_32, 0, %107:vgpr_32,
%112:sreg_64 = SI_ELSE %108:sreg_64, %bb.19, 0, implicit-def dead $exec, implicit-def dead $scc, implicit $exec
expanded to
****************** <== here exec has "THEN" context
s_mov_b64 s[6:7], exec
BB0_16:
v_readfirstlane_b32 s8, v28
v_cmp_eq_u32_e32 vcc, s8, v28
s_and_saveexec_b64 vcc, vcc
s_set_gpr_idx_on s8, gpr_idx(DST)
v_mov_b32_e32 v6, v25
s_set_gpr_idx_off
s_xor_b64 exec, exec, vcc
s_cbranch_execnz BB0_16
; %bb.17:
s_or_saveexec_b64 s[4:5], s[4:5] <-- exec mask is restored for "ELSE" but immediately overwritten.
s_mov_b64 exec, s[6:7]
The rest of the "ELSE" block is executed not by the workitems which constitute the "else mask" but by those which constitute "then mask"
SILowerControlFlow::emitElse always considers the basic block begin() as an insertion point for s_or_saveexec.
Proposed fix: The SI_INDIRECT_DST_V* procedure should split the reminder block to create landing pad for the EXEC restoration.
Reviewers: rampitec, vpykhtin, nhaehnle
Reviewed By: vpykhtin
Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, hiraditya, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75472
More information about the All-commits
mailing list