[PATCH] D52846: [AMDGPU] Add FixupVectorISel pass, currently Supports SREGs in GLOBAL LD/ST
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Oct 4 02:27:09 PDT 2018
arsenm added inline comments.
================
Comment at: lib/Target/AMDGPU/FLATInstructions.td:179
let is_flat_global = 1 in {
- def "" : FLAT_Load_Pseudo<opName, regClass, HasTiedInput, 1, 1>;
- def _SADDR : FLAT_Load_Pseudo<opName, regClass, HasTiedInput, 1, 1, 1>;
+ def "" : FLAT_Load_Pseudo<opName, regClass, HasTiedInput, 1, 1>, GlobalSaddrTable<0, opName>;
+ def _SADDR : FLAT_Load_Pseudo<opName, regClass, HasTiedInput, 1, 1, 1>, GlobalSaddrTable<1, opName>;
----------------
Put these on the following line and indent
================
Comment at: lib/Target/AMDGPU/SIFixupVectorISel.cpp:47-51
+// Opt used by a few lit tests to preserve older CHECK patterns.
+static cl::opt<bool> EnableSIGlobalSAddr(
+ "amdgpu-enable-si-global-saddr",
+ cl::desc("Enable use of SGPR regs for global load/store instructions"),
+ cl::init(true));
----------------
I'd rather not have this option. We have an already excessively long list of band-aids to avoid updating tests
================
Comment at: lib/Target/AMDGPU/SIFixupVectorISel.cpp:100-101
+ continue;
+ MachineInstr *DefInst = MRI.getUniqueVRegDef(WOp->getReg());
+ switch (DefInst->getOpcode()) {
+ default:
----------------
I think you could break this with a shader using an inreg/sgpr argument
================
Comment at: lib/Target/AMDGPU/SIFixupVectorISel.cpp:107-110
+ case AMDGPU::REG_SEQUENCE:
+ Worklist.push_back(&DefInst->getOperand(1));
+ Worklist.push_back(&DefInst->getOperand(3));
+ break;
----------------
This is assuming too much about the REG_SEQUENCE, so at least needs an assert. It could be a wider reg_sequence
================
Comment at: lib/Target/AMDGPU/SIFixupVectorISel.cpp:112-117
+ BaseReg = DefInst->getOperand(2).getReg();
+ if (!TargetRegisterInfo::isVirtualRegister(BaseReg))
+ continue;
+ IndexReg = DefInst->getOperand(3).getReg();
+ if (!TargetRegisterInfo::isVirtualRegister(IndexReg))
+ continue;
----------------
You don't need these. We'll never emit a physical register operand like this. I might worry about seeing a subreg index on the use operand though
================
Comment at: lib/Target/AMDGPU/SIFixupVectorISel.cpp:125
+ MI = MRI.getUniqueVRegDef(BaseReg);
+ if (!(MI && MI->isCopy()))
+ continue;
----------------
de Morgan this
================
Comment at: lib/Target/AMDGPU/SIFixupVectorISel.cpp:207
+bool SIFixupVectorISel::runOnMachineFunction(MachineFunction &MF) {
+ // This pass does not run at -O0, avoid putting correctness patches here.
+ if (skipFunction(MF.getFunction()))
----------------
Comment not needed
================
Comment at: lib/Target/AMDGPU/SIFixupVectorISel.cpp:217-218
+ bool FuncModified = false;
+ for (MachineFunction::iterator BI = MF.begin(), BE = MF.end();
+ BI != BE; ++BI) {
+ // Cleanup missed Saddr opportunites from ISel.
----------------
Range loop
================
Comment at: test/CodeGen/AMDGPU/conv2d-saddr.ll:13
+
+target triple = "amdgcn-unknown-amdhsa"
+
----------------
Move the triple to the command line and drop this
================
Comment at: test/CodeGen/AMDGPU/conv2d-saddr.ll:16
+; Function Attrs: convergent nounwind
+define hidden amdgpu_kernel void @simpleConv2d(i32 addrspace(1)* nocapture %dst_image, i32 addrspace(1)* nocapture readonly %src_image, i32 addrspace(1)* nocapture readonly %conv_kernel) local_unnamed_addr #0 !kernel_arg_addr_space !5 !kernel_arg_access_qual !6 !kernel_arg_type !7 !kernel_arg_base_type !7 !kernel_arg_type_qual !8 {
+entry:
----------------
This testcase is way too big. Most of the tests should be only a handful of instructions. There should be ones stressing the immediate limits. I also don't see ones for the atomics that the comments say work
https://reviews.llvm.org/D52846
More information about the llvm-commits
mailing list