[PATCH] Divergence analysis for GPU programs

Thu Apr 2 22:48:49 PDT 2015

================
Comment at: lib/Analysis/DivergenceAnalysis.cpp:176
@@ +175,3 @@
+  }
+  if (Cond == nullptr)
+    return;
----------------
bjarke.roune wrote:
> jingyue wrote:
> > bjarke.roune wrote:
> > > Can Cond ever be null here, given that we only get to this point if *TI has been marked as a potentially divergent terminator instruction?
> > Since we run reachability on the data flow graph, the Worklist may contain TerminatorInsts (e.g., InvokeInst) that belong to none of the above three types. 
> Would a check for >= 2 successors do the trick? If not, is it necessary to pull out the conditions, or would it suffice to have a boolean HasCond? I wonder if BI->getCondition(), IBI->getAdress() and SI->getCondition() can ever be null.
InvokeInst also has two successors. BI->getCondition(), IBI->getAddress() and SI->getCondition() are all non-null. 

================
Comment at: lib/Analysis/DivergenceAnalysis.cpp:185
@@ +184,3 @@
+    return;
+  for (auto I = IPostDom->begin(); IPostDom->getFirstNonPHI() != I; ++I) {
+    if (Visited.insert(I).second)
----------------
jingyue wrote:
> bjarke.roune wrote:
> > It's better to only make one call to getFirstNonPHI(), since it runs in linear time, so this loop is otherwise quadratic in the number of phi nodes:
> > 
> > http://llvm.org/docs/doxygen/html/BasicBlock_8cpp_source.html#l00161
> Nice catch! Will fix. 
Done. 

================
Comment at: lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp:51
@@ +50,3 @@
+
+  if (const Instruction *I = dyn_cast<Instruction>(V)) {
+    // Without pointer analysis, we conservatively assume values loaded are
----------------
bjarke.roune wrote:
> The return value of function calls can also be divergent. I think this function doesn't account for that?
You're right. Done. 

================
Comment at: lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp:55
@@ +54,3 @@
+    if (isa<LoadInst>(I))
+      return true;
+    // Atomic instructions may cause divergence. Atomic instructions are
----------------
bjarke.roune wrote:
> jingyue wrote:
> > bjarke.roune wrote:
> > > If all the threads in a warp load the same address at the same time, I think that they should all get the same value. If that's right, then the analysis would remain conservative by letting loads of non-divergent pointers yield non-divergent values, regardless of aliasing.
> > That's a very good point! We can be optimistic for at least shared and const loads. 
> > 
> > The local address space is private to each thread. Consequently, generic loads need to be treated conservatively, because their addresses may be converted from local addresses. 
> I hadn't considered local memory. If we marked local addresses as sources of divergence, the only way I can think of where that might go wrong right now is if a local pointer value is stored into global memory and then read back again, then the thus laundered local pointer is used to store a divergent value and then that divergent value is read back again without being marked as divergent. I wonder if that's a common use case.
Bjarke, I am not sure I understand what you mean by "might go wrong". 

Marking values loaded from local address space as divergent is sound, i.e., if our analysis says a branch is non-divergent, it is indeed non-divergent. 

================
Comment at: lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp:58
@@ +57,3 @@
+    // executed sequentially across all threads in a warp. Therefore, an earlier
+    // executed thread may see different memory inputs than an later executed
+    // thread. For example, suppose *a = 0 initially.
----------------
bjarke.roune wrote:
> an -> a
Done. Thanks! 

================
Comment at: lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp:68
@@ +67,3 @@
+    if (const IntrinsicInst *II = dyn_cast<IntrinsicInst>(I)) {
+      // Instructions that read threadIdx are abviously divergent.
+      if (readsThreadIndex(II))
----------------
bjarke.roune wrote:
> abviously -> obviously
Done. 

================
Comment at: lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp:71
@@ +70,3 @@
+        return true;
+      // Handle the NVPTX atomic instrinsics which cannot be represented as an
+      // atomic IR instruction.
----------------
bjarke.roune wrote:
> which -> that
Done.

http://reviews.llvm.org/D8576

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/