[PATCH] D31993: [AMDGPU] Combine DS operations with offsets bigger than byte
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 13 11:11:30 PDT 2017
arsenm added inline comments.
================
Comment at: llvm/trunk/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp:387-389
+ *BuildMI(*MBB, CI.Paired, DL, TII->get(AMDGPU::V_ADD_I32_e32), BaseReg)
+ .addImm(CI.BaseOff)
+ .addReg(AddrReg->getReg());
----------------
This should use the e64 version with an unused carry. We should add a helper to TII to emit this, since it will change with GFX9.
================
Comment at: llvm/trunk/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp:392
+
+ MachineInstrBuilder Read2 = BuildMI(*MBB, CI.Paired, DL, Read2Desc, DestReg)
+ .addReg(BaseReg, BaseRegFlags) // addr
----------------
Put BuildMI on the next line and shift the .add* calls to the left.
================
Comment at: llvm/trunk/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp:397-398
+ .addImm(0) // gds
+ .addMemOperand(*CI.I->memoperands_begin())
+ .addMemOperand(*CI.Paired->memoperands_begin());
(void)Read2;
----------------
Use .setMemRefs here.
================
Comment at: llvm/trunk/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp:459-460
+ BaseRegFlags = RegState::Kill;
+ *BuildMI(*MBB, CI.Paired, DL, TII->get(AMDGPU::V_ADD_I32_e32), BaseReg)
+ .addImm(CI.BaseOff)
+ .addReg(Addr->getReg());
----------------
Same as with the other add
================
Comment at: llvm/trunk/test/CodeGen/AMDGPU/ds-combine-large-stride.ll:1
+; RUN: llc -march=amdgcn -mtriple=amdgcn--amdhsa -verify-machineinstrs < %s | FileCheck %s
+
----------------
-march is redundant with the triple. This should also use a GCN check prefix, since the check lines will need to change with GFX9. Also add a GFX9 run line so the test breaks when the add change is made.
================
Comment at: llvm/trunk/test/CodeGen/AMDGPU/ds-combine-large-stride.ll:370
+; CHECK-DAG: ds_write2st64_b64 [[B3]], v[{{[0-9]+:[0-9]+}}], v[{{[0-9]+:[0-9]+}}] offset1:16
+define void @ds_write64_combine_stride_8192_shifted(double addrspace(3)* nocapture %arg) {
+bb:
----------------
These all need to be marked amdgpu_kernel
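
For readers following the check line above, here is a minimal sketch of where offset1:16 comes from. It assumes the st64 forms scale the 8-bit offset field by 64 times the element size, so an 8192-byte stride with 8-byte elements encodes as 8192 / (64 * 8) = 16. The helper name encodeSt64Offset is hypothetical and is not part of the patch or of TII.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration only (not in SILoadStoreOptimizer).
// Assumption: a read2st64/write2st64 offset field is 8 bits wide and is
// scaled by 64 * EltSize bytes.
// Returns the encoded field value, or -1 if ByteDist cannot be encoded.
constexpr int encodeSt64Offset(uint32_t ByteDist, uint32_t EltSize) {
  if (EltSize == 0)
    return -1;
  const uint32_t Unit = 64 * EltSize; // bytes covered by one offset step
  if (ByteDist % Unit != 0)
    return -1; // not a multiple of the st64 stride
  const uint32_t Encoded = ByteDist / Unit;
  return Encoded <= 255 ? static_cast<int>(Encoded) : -1; // must fit in 8 bits
}

// The stride_8192 b64 case from the test: 8192 / 512 == 16.
static_assert(encodeSt64Offset(8192, 8) == 16, "matches offset1:16");
```

A distance that does not fit this way is presumably what the extra base add in the patch is for: fold the common part into BaseReg and encode only the remainder.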
Repository:
rL LLVM
https://reviews.llvm.org/D31993