[PATCH] D62614: Fix for the OCL/LC to failure on some OCLPerf tests

Alexander via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu May 30 04:38:11 PDT 2019

alex-t added inline comments.

Comment at: lib/Analysis/LegacyDivergenceAnalysis.cpp:180
   //   i++;
   //   if (foo(i)) ... // uniform
   // } while (i < tid);
rampitec wrote:
> Given that pass only runs when TTI.hasBranchDivergence() this change seems reasonable to me. I do not really believe an index of a loop with the divergent condition is uniform. It is only uniform across the enabled lanes, but not across of the whole wave. Basically you make it divergent or uniform depending on uses, which makes it a semi-uniform value, more or less. I think this logic needs a better description though.
All threads of the wavefront that are live in the loop body observe the same value, so formally the value is uniform. The user outside the loop in LCSSA form, again formally, uses a different value: the LCSSA PHI node, which is divergent. The problem arises when LCSSA is converted back to normal form and we insert an SGPR-to-VGPR copy.
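
To make the distinction concrete, here is a minimal C++ sketch that simulates one wavefront running the loop from the comment (i = 0; do { i++; } while (i < tid);) in lockstep. It is only a toy model, not LLVM code: LoopTrace, simulateWave and allEqual are hypothetical names made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Toy model only: these names are hypothetical and not part of LLVM.
struct LoopTrace {
  bool uniformInsideLoop;      // live lanes agreed on i in every iteration
  std::vector<int> exitValues; // value of i each lane carries out of the loop
};

// Simulates: i = 0; do { i++; } while (i < tid);  for all lanes in lockstep.
LoopTrace simulateWave(const std::vector<int> &tids) {
  assert(!tids.empty() && "a wave needs at least one lane");
  const std::size_t n = tids.size();
  std::vector<bool> live(n, true);
  std::vector<int> i(n, 0);
  LoopTrace trace{true, std::vector<int>(n, 0)};
  std::size_t liveCount = n;

  while (liveCount > 0) {
    // Loop body: every live lane increments its copy of i.
    std::set<int> seen;
    for (std::size_t lane = 0; lane < n; ++lane) {
      if (!live[lane])
        continue;
      ++i[lane];
      seen.insert(i[lane]);
    }
    if (seen.size() > 1)
      trace.uniformInsideLoop = false;

    // Divergent exit: lanes leave at different iterations and freeze their i.
    for (std::size_t lane = 0; lane < n; ++lane) {
      if (live[lane] && !(i[lane] < tids[lane])) {
        live[lane] = false;
        trace.exitValues[lane] = i[lane];
        --liveCount;
      }
    }
  }
  return trace;
}

// True if every lane holds the same value.
bool allEqual(const std::vector<int> &v) {
  for (int x : v)
    if (x != v.front())
      return false;
  return true;
}
```

For tids {0, 1, 2, 3} the live lanes agree on i in every iteration, yet the values carried out of the loop differ per lane, which is why the LCSSA PHI node has to be treated as divergent.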

The truth is that the value inside the loop is uniform but NOT scalar!

The proper fix would be to split the decision into two independent parts:
1. Is the value uniform - !DA->isDivergent(V)
2. Is it SALU or VALU:  TargetLowering::isVALU(V)  - this should check for users outside the divergent loop and as many other conditions as are necessary for the target.
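
A rough C++ sketch of how the two questions could be kept separate. This is an assumption-laden illustration rather than a patch: ValueInfo, Unit and selectUnit are hypothetical, and isVALU here only stands in for the proposed TargetLowering::isVALU(V) hook, which does not exist today and would need more target-specific checks.

```cpp
#include <cassert>

enum class Unit { SALU, VALU };

// Hypothetical summary of what is known about a value V.
struct ValueInfo {
  bool divergent;                // stand-in for DA->isDivergent(V)
  bool usedOutsideDivergentLoop; // V has a user outside a loop with a divergent exit
};

// Part 1: uniformity, answered by divergence analysis alone.
bool isUniform(const ValueInfo &v) { return !v.divergent; }

// Part 2: stand-in for the proposed target hook. A uniform value can still
// need a vector register if it is used outside a divergent loop.
bool isVALU(const ValueInfo &v) {
  return v.divergent || v.usedOutsideDivergentLoop;
}

Unit selectUnit(const ValueInfo &v) {
  // Invariant: a divergent value must never be placed on the scalar unit.
  assert((!v.divergent || isVALU(v)) && "divergent value selected for SALU");
  return isVALU(v) ? Unit::VALU : Unit::SALU;
}
```

With this split, the loop index from the example stays uniform as far as the analysis is concerned, but is still selected for VALU because of its out-of-loop user.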


