[PATCH] D15425: AMDGPU/SI: Select constant loads with non-uniform addresses to MUBUF instructions
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 10 11:10:17 PST 2015
arsenm added inline comments.
================
Comment at: lib/Target/AMDGPU/AMDGPUAnnotateUniformValues.cpp:37-39
@@ +36,5 @@
+
+ void setAllBranchesUniform(bool Uniform) {
+ if (AllBranchesUniform) *AllBranchesUniform = Uniform;
+ }
+
----------------
This + AllBranchesUniform don't seem to be used anywhere
================
Comment at: lib/Target/AMDGPU/AMDGPUAnnotateUniformValues.cpp:87
@@ +86,3 @@
+
+ return false;
+}
----------------
Should return true as the conservative default if not recording when changes are actually made
================
Comment at: lib/Target/AMDGPU/AMDGPUTargetMachine.cpp:282-283
@@ -280,2 +281,4 @@
addPass(createSITypeRewriter());
addPass(createSIAnnotateControlFlowPass());
+ addPass(createAMDGPUAnnotateUniformValues());
+
----------------
Why is this run after the control flow annotate pass?
================
Comment at: lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:95-96
@@ +94,4 @@
+
+static bool isIntrinsicSourceOfDivergence(const TargetIntrinsicInfo *TII,
+ const IntrinsicInst *I) {
+ switch (I->getIntrinsicID()) {
----------------
I think the implementation of isSourceOfDivergence should be committed as a separate patch
================
Comment at: lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:111-118
@@ +110,10 @@
+
+ StringRef Name = I->getCalledFunction()->getName();
+ switch (TII->lookupName((const char *)Name.bytes_begin(), Name.size())) {
+ default:
+ return false;
+ case AMDGPUIntrinsic::SI_tid:
+ case AMDGPUIntrinsic::SI_vs_load_input:
+ case AMDGPUIntrinsic::SI_fs_constant:
+ case AMDGPUIntrinsic::SI_fs_interp:
+ return true;
----------------
I think we should just move these into the public header so we get the enum IDs. I think mesa already directly uses these, so it's where they belong anyway.
================
Comment at: lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:119
@@ +118,3 @@
+ case AMDGPUIntrinsic::SI_fs_interp:
+ return true;
+ }
----------------
Do image/sample instructions need to be considered divergent? We also need some of the cross lane intrinsics but I'm not sure we have those yet
================
Comment at: lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:125
@@ +124,3 @@
+/// \returns true if the result of the value could potentially be
+/// different across threads.
+bool AMDGPUTTIImpl::isSourceOfDivergence(const Value *V) const {
----------------
s/threads/workitems in a wavefront/
================
Comment at: lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:137-138
@@ +136,4 @@
+
+ if (F->getAttributes().hasAttribute(A->getArgNo(), Attribute::InReg))
+ return false;
+
----------------
I think this should be split into a function called something like isArgPassedInSGPR with a comment about how inreg is interpreted
================
Comment at: lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:141-142
@@ +140,4 @@
+ LLVMContext &Ctx = V->getContext();
+ return V->getType() == Type::getInt32Ty(Ctx) ||
+ V->getType() == Type::getFloatTy(Ctx);
+
----------------
Ty->isInt32Ty() / isFloatTy() would be better
================
Comment at: lib/Target/AMDGPU/SIISelLowering.cpp:514
@@ +513,3 @@
+ // UndefValue means this is a load of a kernel input. These are uniform.
+ if (isa<UndefValue>(Ptr) || isa<Argument>(Ptr))
+ return true;
----------------
This maybe should check constant isa<Constant> which happens occasionally for LDS
================
Comment at: lib/Target/AMDGPU/SIInstructions.td:2111
@@ +2110,3 @@
+// selector to prefer those.
+let AddedComplexity = 10000 in {
+
----------------
A smaller number would be better. 10 is probably more than enough
http://reviews.llvm.org/D15425
More information about the llvm-commits
mailing list