[PATCH] D121524: [AMDGPU] use scalar shift for SALU users in frame index elimination
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 14 10:17:05 PDT 2022
rampitec added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp:2239
+ const TargetRegisterClass *RC = IsSALU ? &AMDGPU::SReg_32RegClass
+ : &AMDGPU::VGPR_32RegClass;
bool IsCopy = MI->getOpcode() == AMDGPU::V_MOV_B32_e32;
----------------
Formatting.
================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp:2240
+ : &AMDGPU::VGPR_32RegClass;
bool IsCopy = MI->getOpcode() == AMDGPU::V_MOV_B32_e32;
Register ResultReg =
----------------
This can probably be another mov as well (a scalar one for SALU users)?
================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp:2242
Register ResultReg =
IsCopy ? MI->getOperand(0).getReg()
+ : RS->scavengeRegister(RC, MI, 0);
----------------
Formatting.
================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp:2248
+ unsigned OpCode =
+ IsSALU ? AMDGPU::S_LSHR_B32 : AMDGPU::V_LSHRREV_B32_e64;
// XXX - This never happens because of emergency scavenging slot at 0?
----------------
What if $scc is live at this point? S_LSHR_B32 clobbers it.
The $scc def should also be marked dead.
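
To illustrate the concern, here is a toy model of the two shifts. It is not LLVM code: ScalarState, s_lshr_b32 and v_lshrrev_b32 are hypothetical names, and the sketch assumes the usual ISA behaviour that the scalar shift sets SCC when its result is non-zero, while the vector shift leaves SCC alone.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for the scalar status bit (hypothetical, not an LLVM type).
struct ScalarState {
  bool scc = false;
};

// Models S_LSHR_B32: logical shift right by the low 5 bits of `shift`.
// Assumption: SCC is overwritten with (result != 0).
inline uint32_t s_lshr_b32(ScalarState &st, uint32_t src, uint32_t shift) {
  uint32_t result = src >> (shift & 31u);
  st.scc = (result != 0);
  return result;
}

// Models V_LSHRREV_B32 for a single lane: same shift with reversed
// operand order, and no effect on SCC.
inline uint32_t v_lshrrev_b32(uint32_t shift, uint32_t src) {
  return src >> (shift & 31u);
}
```

If an earlier compare left a value in SCC that a later instruction still reads, switching from the vector shift to the scalar shift silently overwrites it, which is why liveness of $scc matters at the insertion point.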
================
Comment at: llvm/test/CodeGen/AMDGPU/frame-index.mir:1
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=kaveri -mattr=-promote-alloca -amdgpu-sroa=0 -verify-machineinstrs -run-pass=prologepilog -o - %s | FileCheck -check-prefix=GCN %s
+
----------------
No need to pass -mattr=-promote-alloca -amdgpu-sroa=0.
================
Comment at: llvm/test/CodeGen/AMDGPU/frame-index.mir:7
+
+ define void @func_add_constant_to_fi_divergent_i32() {
+ %alloca = alloca [2 x i32], align 4, addrspace(5)
----------------
IR is also not needed.
================
Comment at: llvm/test/CodeGen/AMDGPU/frame-index.mir:59
+ $m0 = S_MOV_B32 -1
+ DS_WRITE_B32 undef renamable $vgpr0, killed renamable $vgpr0, 0, 0, implicit $m0, implicit $exec :: (volatile store (s32) into `i32 addrspace(5)* addrspace(3)* undef`, addrspace 3)
+ S_SETPC_B64_return killed renamable $sgpr30_sgpr31
----------------
Just remove the memory operand (the part starting from ::) so you do not need the IR.
================
Comment at: llvm/test/CodeGen/AMDGPU/frame-index.mir:88
+# GCN-LABEL: name: func_other_fi_user_non_inline_imm_offset_i32
+# GCN: $vcc_hi = S_LSHR_B32 $sgpr32, 6
+# GCN: $vcc_hi = S_ADD_I32 killed $vcc_hi, 512
----------------
Generate the checks. This one misses implicit-def dead $scc.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D121524/new/
https://reviews.llvm.org/D121524