[PATCH] D52846: [AMDGPU] Add FixupVectorISel pass, currently Supports SREGs in GLOBAL LD/ST
Ron Lieberman via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 5 13:34:16 PDT 2018
ronlieb marked 10 inline comments as done.
ronlieb added a comment.
Matt, thanks for review, I have addressed most of the issues although the inreg/sgpr one may require some more investigation on my end.
see my responses to each of your review comments.
A revision to this patch will be uploaded in the next hour or so...
================
Comment at: lib/Target/AMDGPU/SIFixupVectorISel.cpp:47-51
+// Opt used by a few lit tests to preserve older CHECK patterns.
+static cl::opt<bool> EnableSIGlobalSAddr(
+ "amdgpu-enable-si-global-saddr",
+ cl::desc("Enable use of SGPR regs for global load/store instructions"),
+ cl::init(true));
----------------
arsenm wrote:
> I'd rather not have this option. We have an already excessively long list of band-aids to avoid updating tests
option removed, and tests altered to not reference the now non existent flag.
================
Comment at: lib/Target/AMDGPU/SIFixupVectorISel.cpp:100-101
+ continue;
+ MachineInstr *DefInst = MRI.getUniqueVRegDef(WOp->getReg());
+ switch (DefInst->getOpcode()) {
+ default:
----------------
arsenm wrote:
> I think you could break this with a shader using an inreg/sgpr argument
I tried to construct a test that exercises this path, its in the next revision of this patch in a test named global-saddr-misc.ll
================
Comment at: lib/Target/AMDGPU/SIFixupVectorISel.cpp:107-110
+ case AMDGPU::REG_SEQUENCE:
+ Worklist.push_back(&DefInst->getOperand(1));
+ Worklist.push_back(&DefInst->getOperand(3));
+ break;
----------------
arsenm wrote:
> This is assuming too much about the REG_SEQUENCE, so at least needs an assert. It could be a wider reg_sequence
assert added
================
Comment at: lib/Target/AMDGPU/SIFixupVectorISel.cpp:112-117
+ BaseReg = DefInst->getOperand(2).getReg();
+ if (!TargetRegisterInfo::isVirtualRegister(BaseReg))
+ continue;
+ IndexReg = DefInst->getOperand(3).getReg();
+ if (!TargetRegisterInfo::isVirtualRegister(IndexReg))
+ continue;
----------------
arsenm wrote:
> You don't need these. We'll never emit a physical register operand like this. I might worry about seeing a subreg index on the use operand though
changed to check for subreg.
================
Comment at: test/CodeGen/AMDGPU/conv2d-saddr.ll:16
+; Function Attrs: convergent nounwind
+define hidden amdgpu_kernel void @simpleConv2d(i32 addrspace(1)* nocapture %dst_image, i32 addrspace(1)* nocapture readonly %src_image, i32 addrspace(1)* nocapture readonly %conv_kernel) local_unnamed_addr #0 !kernel_arg_addr_space !5 !kernel_arg_access_qual !6 !kernel_arg_type !7 !kernel_arg_base_type !7 !kernel_arg_type_qual !8 {
+entry:
----------------
arsenm wrote:
> This testcase is way too big. Most of the tests should be only a handful of instructions. There should be ones stressing the immediate limits. I also don't see ones for the atomics that the comments say work
reduced test case to 47 lines total.
https://reviews.llvm.org/D52846
More information about the llvm-commits
mailing list