[PATCH] D40547: AMDGPU: Fix copying i1 value out of loop with non-uniform exit

Nicolai Hähnle via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 26 08:23:40 PST 2018


nhaehnle added inline comments.
Herald added a subscriber: hintonda.


================
Comment at: lib/Target/AMDGPU/Utils/AMDGPULaneDominator.cpp:13-15
+// 2. whenever B executes, every active lane during that execution of B was
+//    also active during the most recent execution of A.
+//
----------------
arsenm wrote:
> Are you really trying to find control flow equivalent regions? Or is this a active lanes are a subset?
They don't have to be equivalent. In the case of:
```
   A
  / \
 /   \
B     C
```
A lane-dominates both B and C, but they're not equivalent.


================
Comment at: lib/Target/AMDGPU/Utils/AMDGPULaneDominator.cpp:70
+
+  return true;
+}
----------------
t-tye wrote:
> Would this return true for the CFG:
> 
> B
> |
> |
> v
> A
> |
> |
> v
> end
> 
> Since no successors of A reaches A to cause false to be returned. But no check seems to be made to ensure that A traditionally-dominates B.
I'm afraid I'm not sure what the CFG you have in mind looks like. The function assumes that A traditionally-dominates B, the caller must ensure that (as per the comment).


https://reviews.llvm.org/D40547





More information about the llvm-commits mailing list