[PATCH] Divergence analysis for GPU programs

Jingyue Wu jingyue at google.com
Fri Apr 3 15:47:28 PDT 2015


http://reviews.llvm.org/D8576 generalizes the interface to
isDivergent(Value *) and isUniform(Value *). PTAL.
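
For readers of the archive, a minimal sketch of what a value-level query of
this shape could look like. This is not the code in D8576: Value below is a
hypothetical stand-in for llvm::Value, and the DivergenceInfo class and its
markDivergent method are names assumed purely for illustration.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical stand-in for llvm::Value; only object identity matters here.
struct Value {
  std::string Name;
};

// Toy result holder (assumed name). It records every divergent value,
// not only divergent branches, so any value can be queried afterwards.
class DivergenceInfo {
  std::unordered_set<const Value *> DivergentValues;

public:
  // Called by the (omitted) analysis when it decides V is divergent.
  void markDivergent(const Value *V) { DivergentValues.insert(V); }

  // A value is divergent if the analysis recorded it as such.
  bool isDivergent(const Value *V) const {
    return DivergentValues.count(V) != 0;
  }

  // In this sketch, everything not recorded as divergent is uniform.
  bool isUniform(const Value *V) const { return !isDivergent(V); }
};
```

Keeping the whole set of divergent values around is the space cost discussed
under point 2 below; a branch-only query would need to store less.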

On Fri, Apr 3, 2015 at 10:36 AM Jingyue Wu <jingyue at google.com> wrote:

> On Fri, Apr 3, 2015 at 6:28 AM Fernando Magno Quintao Pereira <
> fernando at dcc.ufmg.br> wrote:
>
>> Hi Jingyue.
>>
>>     I went over your code today. It looks very nice to me. I believe
>> the programming style is very clear, and the analysis seems to be
>> correct to me. I am sending you below a few observations.
>>
>> Regards,
>>
>> Fernando
>>
>> ---
>>
>> 1) I think that the more recent TOPLAS paper, "Divergence Analysis -
>> Sampaio, Souza, Collange, Pereira, 2013", is a better source of
>> information than our original PACT publication (for the comment in
>> lines 31-33).
>>
>
> Ack'ed. Will fix.
>
>
>>
>> 2) Would it be possible to have a function such as "bool
>> isUniformValue(const Value *V) const" or (just the same) "bool
>> isDivergentValue(const Value *V) const" in the public interface of the
>> analysis? In fact, this is more general than "isDivergentBranch"
>> (lines 107-109), and it could help the register allocator, for
>> instance.
>>
>
> Agreed. The only concern I have is we need to keep all divergent
> instructions (as opposed to all divergent branches) for the users of the
> divergence analysis. I think it's fine given this analysis already spends
> this amount of space when computing divergent values.
>
>
>>
>> 3) For non-structured codes, is it not possible that your method
>> exploreSyncDependency may cause the same node to be visited more than
>> once? Imagine a CFG that is like a butterfly, like the one in the PDF
>> I sent you. See line 290. You could avoid this by interrupting the
>> search once you visit a divergent instruction.
>>
>
> I don't get this. Both exploreSyncDependency and exploreDataDependency
> check the visited flag of a value before adding it to the work list. I
> think this is enough for ensuring each value is explored at most once.
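
The visited-flag argument above can be sketched as a plain worklist
propagation. This is an illustration under assumptions, not the patch itself:
the dependence graph is a hypothetical adjacency list over value indices, data
and sync dependences are not distinguished, and propagateDivergence is an
invented name. The "divergent" bit doubles as the visited flag.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical dependence graph: Users[v] lists the values that depend on v.
using DependenceGraph = std::vector<std::vector<std::size_t>>;

// Marks everything reachable from Sources as divergent. If Explored is
// non-null, it receives the number of values popped from the worklist.
std::vector<bool> propagateDivergence(const DependenceGraph &Users,
                                      const std::vector<std::size_t> &Sources,
                                      std::size_t *Explored = nullptr) {
  std::vector<bool> Divergent(Users.size(), false);
  std::vector<std::size_t> Worklist;

  // Seed the worklist; the flag check keeps duplicate sources out.
  for (std::size_t S : Sources) {
    if (!Divergent[S]) {
      Divergent[S] = true;
      Worklist.push_back(S);
    }
  }

  std::size_t Count = 0;
  while (!Worklist.empty()) {
    std::size_t V = Worklist.back();
    Worklist.pop_back();
    ++Count;
    for (std::size_t U : Users[V]) {
      // A value is pushed only the first time it is reached, so each
      // value is explored at most once even in a butterfly-shaped graph.
      if (!Divergent[U]) {
        Divergent[U] = true;
        Worklist.push_back(U);
      }
    }
  }

  if (Explored)
    *Explored = Count;
  return Divergent;
}
```

On a butterfly-shaped graph where two sources both feed the same two values,
the number of explorations stays equal to the number of reachable values.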
>
>
>>
>> 4) I think you are flagging a variable that is divergent outside a
>> loop as divergent wherever it is live, even if it is uniform inside
>> the loop. That is not wrong, but it may lead to less precise results.
>> Imagine, for instance:
>>
>> int i = 0;
>> while (i < tid) {
>>   i++;
>>   if (i % 2 == 0) {  // i is uniform for every active thread
>>     int x = 2 * i;   // x is uniform for every active thread
>>     v[tid] = x;
>>   }
>> }
>> v[tid] = i; // i is divergent for every thread
>>
>> A more precise analysis would split the live range of i right after
>> the loop, and you would end up with something like:
>>
>> int i = 0;
>> while (i < tid) {
>>   i++;
>>   if (i % 2 == 0) {
>>     int x = 2 * i;
>>     v[tid] = x;
>>   }
>> }
>> // i0 is divergent, because it is used outside the influence region.
>> i0 = phi(i);
>> v[tid] = i0;
>>
>>
> This is a very good point, but I need to mention two things.
> 1. My implementation does not mark the i in your example as divergent; it
> only marks its out-of-the-loop users as divergent. Therefore, if you worry
> about whether x is considered divergent, the answer is no.
> 2. Divergence analysis is supposed to be read-only and shouldn't modify the
> program. However, if we want to split the live range of i for a more
> precise model, we can run LoopSimplify+LCSSA before the divergence
> analysis. LCSSA (http://llvm.org/docs/doxygen/html/LCSSA_8cpp_source.html)
> in particular will rewrite all out-of-loop users to use a PHI node with a
> single incoming value (see the header comments in LCSSA.cpp). I believe
> that when run on LCSSA form, my current implementation can distinguish
> the two live ranges. I'll double check that.
> P.S. LCSSA only transforms natural loops.
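
Point 1 above (only the out-of-loop users of i are marked, not i itself) can
be illustrated with a small sketch. It assumes we already know whether the
loop's exit condition is divergent and whether each user sits inside the
loop; the Use struct and divergentUsers function are hypothetical names, not
part of the patch.

```cpp
#include <string>
#include <vector>

// Hypothetical description of one user of a value defined inside a loop.
struct Use {
  std::string User; // textual form of the using statement
  bool InsideLoop;  // true if the user is inside the defining loop
};

// Returns the users that become divergent because they are sync dependent
// on the loop's exit condition. The defining value itself is not marked.
std::vector<std::string> divergentUsers(bool ExitConditionDivergent,
                                        const std::vector<Use> &Uses) {
  std::vector<std::string> Result;
  if (!ExitConditionDivergent)
    return Result; // all threads leave the loop together
  for (const Use &U : Uses) {
    // Users inside the loop see the same value in every active thread;
    // only users outside the loop observe per-thread final values.
    if (!U.InsideLoop)
      Result.push_back(U.User);
  }
  return Result;
}
```

Applied to the example in point 4, with the divergent exit condition i < tid,
this would report the store after the loop and leave x = 2 * i untouched.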
>
>
>>
>> On 4/3/15, Jingyue Wu <jingyue at google.com> wrote:
>> > Hi Fernando,
>> >
>> > I fixed the sync dependency computation in this update. I used to consider
>> > only the if-then-else case; this version accounts for the loop case.
>> >
>> > Please take a look when you have time. Thanks a lot for your help!
>> >
>> > Jingyue
>> >
>> > ---------- Forwarded message ---------
>> > From: Jingyue Wu <jingyue at google.com>
>> > Date: Thu, Apr 2, 2015 at 10:48 PM
>> > Subject: Re: [PATCH] Divergence analysis for GPU programs
>> > To: <jingyue at google.com>, <resistor at mac.com>, <hfinkel at anl.gov>,
>> > <eliben at google.com>, <meheff at google.com>, <justin.holewinski at gmail.com>
>> > Cc: <bjarke.roune at gmail.com>, <madhur13490 at gmail.com>,
>> > <thomas.stellard at amd.com>, <dberlin at dberlin.org>, <echristo at gmail.com>,
>> > <llvm-commits at cs.uiuc.edu>
>> >
>> >
>> > This update fixes sync dependency computation. If a value is used outside
>> > the loop in which it is defined, the user is sync dependent on the exit
>> > condition of the loop.
>> >
>> >
>> > http://reviews.llvm.org/D8576
>> >
>> > Files:
>> >   include/llvm/Analysis/Passes.h
>> >   include/llvm/Analysis/TargetTransformInfo.h
>> >   include/llvm/Analysis/TargetTransformInfoImpl.h
>> >   include/llvm/CodeGen/BasicTTIImpl.h
>> >   include/llvm/InitializePasses.h
>> >   include/llvm/LinkAllPasses.h
>> >   lib/Analysis/Analysis.cpp
>> >   lib/Analysis/CMakeLists.txt
>> >   lib/Analysis/DivergenceAnalysis.cpp
>> >   lib/Analysis/TargetTransformInfo.cpp
>> >   lib/Target/NVPTX/NVPTXTargetTransformInfo.cpp
>> >   lib/Target/NVPTX/NVPTXTargetTransformInfo.h
>> >   test/Analysis/DivergenceAnalysis/NVPTX/diverge.ll
>> >   test/Analysis/DivergenceAnalysis/NVPTX/lit.local.cfg
>> >
>>
>