[PATCH] D121524: [AMDGPU] use scalar shift for SALU users in frame index elimination

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 14 10:17:05 PDT 2022


rampitec added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp:2239
+        const TargetRegisterClass *RC = IsSALU ? &AMDGPU::SReg_32RegClass
+                                                   : &AMDGPU::VGPR_32RegClass;
         bool IsCopy = MI->getOpcode() == AMDGPU::V_MOV_B32_e32;
----------------
Formatting.


================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp:2240
+                                                   : &AMDGPU::VGPR_32RegClass;
         bool IsCopy = MI->getOpcode() == AMDGPU::V_MOV_B32_e32;
         Register ResultReg =
----------------
Can probably be another mov?


================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp:2242
         Register ResultReg =
             IsCopy ? MI->getOperand(0).getReg()
+                   : RS->scavengeRegister(RC, MI, 0);
----------------
Formatting.


================
Comment at: llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp:2248
+          unsigned OpCode =
+              IsSALU ? AMDGPU::S_LSHR_B32 : AMDGPU::V_LSHRREV_B32_e64;
           // XXX - This never happens because of emergency scavenging slot at 0?
----------------
What if $scc is live here? S_LSHR_B32 clobbers it.
The implicit $scc def should also be marked dead.


================
Comment at: llvm/test/CodeGen/AMDGPU/frame-index.mir:1
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=kaveri -mattr=-promote-alloca -amdgpu-sroa=0 -verify-machineinstrs -run-pass=prologepilog -o - %s | FileCheck -check-prefix=GCN %s
+
----------------
No need to pass -mattr=-promote-alloca -amdgpu-sroa=0.


================
Comment at: llvm/test/CodeGen/AMDGPU/frame-index.mir:7
+  
+  define void @func_add_constant_to_fi_divergent_i32() {
+    %alloca = alloca [2 x i32], align 4, addrspace(5)
----------------
IR is also not needed.


================
Comment at: llvm/test/CodeGen/AMDGPU/frame-index.mir:59
+    $m0 = S_MOV_B32 -1
+    DS_WRITE_B32 undef renamable $vgpr0, killed renamable $vgpr0, 0, 0, implicit $m0, implicit $exec :: (volatile store (s32) into `i32 addrspace(5)* addrspace(3)* undef`, addrspace 3)
+    S_SETPC_B64_return killed renamable $sgpr30_sgpr31
----------------
Just remove the memory operand (the part starting from ::) so you do not need IR.


================
Comment at: llvm/test/CodeGen/AMDGPU/frame-index.mir:88
+# GCN-LABEL: name: func_other_fi_user_non_inline_imm_offset_i32
+# GCN: $vcc_hi = S_LSHR_B32 $sgpr32, 6
+# GCN: $vcc_hi = S_ADD_I32 killed $vcc_hi, 512
----------------
Generate the checks. This one misses implicit-def dead $scc.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121524/new/

https://reviews.llvm.org/D121524


