[PATCH] Divergence analysis for GPU programs
Bjarke Hammersholt Roune
bjarke.roune at gmail.com
Mon Mar 30 19:40:46 PDT 2015
================
Comment at: lib/Analysis/DivergenceAnalysis.cpp:176
@@ +175,3 @@
+ }
+ if (Cond == nullptr)
+ return;
----------------
Can Cond ever be null here, given that we only get to this point if *TI has been marked as a potentially divergent terminator instruction?
================
Comment at: lib/Analysis/DivergenceAnalysis.cpp:179
@@ +178,3 @@
+
+ // Since TI is divergent, Cond is also divergent. Per the definition of sync
+ // dependency, we mark all PHINodes in TI's immediate post dominator block as
----------------
(ignore - for some reason Phabricator won't let me delete this)
================
Comment at: lib/Analysis/DivergenceAnalysis.cpp:182
@@ +181,3 @@
+ // divergent.
+ BasicBlock *IPostDom = PDT.getNode(TI->getParent())->getIDom()->getBlock();
+ if (IPostDom == nullptr)
----------------
More phi nodes than these might need to be marked as divergent. If the hardware can recognize that the diverged subsets of a warp reconverge at a point before the immediate post-dominator, depending on the (dynamic) path taken by each subset of threads, then phi nodes at that earlier point would be divergent too.
================
Comment at: lib/Analysis/DivergenceAnalysis.cpp:185
@@ +184,3 @@
+ return;
+ for (auto I = IPostDom->begin(); IPostDom->getFirstNonPHI() != I; ++I) {
+ if (Visited.insert(I).second)
----------------
It's better to call getFirstNonPHI() only once. It runs in time linear in the number of phi nodes, so calling it in the loop condition makes this loop quadratic in the number of phi nodes:
http://llvm.org/docs/doxygen/html/BasicBlock_8cpp_source.html#l00161
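
To illustrate the cost, here is a minimal self-contained sketch. It does not use the LLVM API: Inst, Block, firstNonPhi(), makeBlock() and the two visit functions are hypothetical stand-ins, and the Scans counter exists only to make the work visible. It assumes, as in LLVM IR, that all phi nodes sit at the start of the block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for llvm::Instruction and llvm::BasicBlock.
struct Inst {
  bool IsPhi;
};

struct Block {
  std::vector<Inst> Insts;       // phi nodes come first, then other instructions
  mutable std::size_t Scans = 0; // number of phi nodes examined by firstNonPhi()

  // Linear scan over the leading phi nodes, modelling the cost of
  // BasicBlock::getFirstNonPHI().
  std::size_t firstNonPhi() const {
    std::size_t I = 0;
    while (I < Insts.size() && Insts[I].IsPhi) {
      ++I;
      ++Scans;
    }
    return I;
  }
};

// Builds a block with NumPhis phi nodes followed by one non-phi instruction.
Block makeBlock(std::size_t NumPhis) {
  Block B;
  B.Insts.assign(NumPhis, Inst{true});
  B.Insts.push_back(Inst{false});
  return B;
}

// Shape of the loop in the patch: the bound is recomputed on every iteration.
std::size_t visitPhisRescanning(const Block &B) {
  std::size_t Visited = 0;
  for (std::size_t I = 0; B.firstNonPhi() != I; ++I)
    ++Visited;
  return Visited;
}

// Suggested shape: the bound is computed once, before the loop starts.
std::size_t visitPhisHoisted(const Block &B) {
  std::size_t Visited = 0;
  for (std::size_t I = 0, E = B.firstNonPhi(); I != E; ++I)
    ++Visited;
  return Visited;
}
```

With N phi nodes, the first loop examines N * (N + 1) phi nodes in total while the second examines N; both visit the same N phi nodes.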
================
Comment at: lib/Analysis/DivergenceAnalysis.cpp:211
@@ +210,3 @@
+ exploreSyncDependency(TI);
+ }
+ exploreDataDependency(V);
----------------
(ignore - for some reason Phabricator won't let me delete this)
================
Comment at: lib/Analysis/DivergenceAnalysis.cpp:212
@@ +211,3 @@
+ }
+ exploreDataDependency(V);
+ }
----------------
Does any terminator instruction have a value? If not, I think this could be in an else branch.
================
Comment at: lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp:55
@@ +54,3 @@
+ if (isa<LoadInst>(I))
+ return true;
+ // Atomic instructions may cause divergence. Atomic instructions are
----------------
If all the threads in a warp load from the same address at the same time, I think that they should all get the same value. If that's right, then the analysis would remain conservative if it let loads through non-divergent pointers yield non-divergent values, regardless of aliasing.
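
A toy model of that argument, as a sketch only. It assumes a single flat memory shared by every lane of the warp and a load that all lanes execute in the same step; if some address space were private to each thread, the argument would not carry over. WarpSize, warpLoad(), isUniform() and the helper functions are hypothetical names for this illustration and are not part of the patch or of LLVM.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t WarpSize = 32; // assumed warp width for this illustration

using Addresses = std::array<std::size_t, WarpSize>;

// Every lane loads from the address it holds. All lanes read the same
// snapshot of Memory, so no store can intervene between two lanes.
std::array<int, WarpSize> warpLoad(const std::vector<int> &Memory,
                                   const Addresses &Addr) {
  std::array<int, WarpSize> Out{};
  for (std::size_t Lane = 0; Lane < WarpSize; ++Lane)
    Out[Lane] = Memory[Addr[Lane]];
  return Out;
}

// A per-lane value is uniform (non-divergent) if all lanes hold the same value.
template <typename T>
bool isUniform(const std::array<T, WarpSize> &V) {
  for (const T &X : V)
    if (X != V[0])
      return false;
  return true;
}

// Non-divergent pointer: every lane holds the same address A.
Addresses uniformAddresses(std::size_t A) {
  Addresses Addr{};
  Addr.fill(A);
  return Addr;
}

// Divergent pointer: lane i holds address i (for example, indexed by threadIdx).
Addresses laneAddresses() {
  Addresses Addr{};
  for (std::size_t Lane = 0; Lane < WarpSize; ++Lane)
    Addr[Lane] = Lane;
  return Addr;
}

// Memory whose cells all differ, so that distinct addresses give distinct values.
std::vector<int> makeMemory() {
  std::vector<int> Memory(WarpSize);
  for (std::size_t I = 0; I < WarpSize; ++I)
    Memory[I] = static_cast<int>(I) * 3;
  return Memory;
}
```

In this model a load through uniformAddresses(A) always yields a uniform result, whatever else aliases that cell, while a load through laneAddresses() can yield a divergent one.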
================
Comment at: lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp:58
@@ +57,3 @@
+ // executed sequentially across all threads in a warp. Therefore, an earlier
+ // executed thread may see different memory inputs than an later executed
+ // thread. For example, suppose *a = 0 initially.
----------------
an -> a
================
Comment at: lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp:68
@@ +67,3 @@
+ if (const IntrinsicInst *II = dyn_cast<IntrinsicInst>(I)) {
+ // Instructions that read threadIdx are abviously divergent.
+ if (readsThreadIndex(II))
----------------
abviously -> obviously
================
Comment at: lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp:71
@@ +70,3 @@
+ return true;
+ // Handle the NVPTX atomic instrinsics which cannot be represented as an
+ // atomic IR instruction.
----------------
which -> that
http://reviews.llvm.org/D8576