[PATCH] Adding support for NoDuplicate function attribute in CLang

David Tweed david.tweed at gmail.com
Thu Nov 21 01:23:15 PST 2013


Hi,

On Thu, Nov 21, 2013 at 8:54 AM, Pekka Jääskeläinen <
pekka.jaaskelainen at tut.fi> wrote:

> Hi all,
>
>
> On 11/20/2013 03:54 PM, David Tweed wrote:
>
>> Note that in the OpenCL use case, the name precisely describes the intent:
>> you can move barrier calls around, inline them, etc, if you can show that
>> the semantics of OpenCL code is the same but (on some particular
>> architectures) you aren't allowed to duplicate a given barrier call (due
>> to
>> implementation restrictions) even if otherwise the semantics was ok. This
>> relates to the LLVM function attribute named noduplicate (as visible in
>> the
>> patch, obviously)
>>
>
> While I think 'noduplicate' is a fine workaround for the problem at hand,
> I get a feeling it also "throws the baby out with the bath water" a bit,
> disallowing some legal optimizations on SPMD programs in the process.
>
>
My understanding is that a _particular implementation_ of OpenCL can put the
noduplicate attribute on its declaration of barrier if its implementation of
the barrier requires that; an implementation that doesn't need it can declare
barrier without the attribute.
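
To make that concrete, here is a rough sketch of how an implementation's
header could opt in. The names IMPL_NEEDS_UNIQUE_BARRIER, IMPL_BARRIER_ATTR
and impl_barrier are made up for illustration, and the spelling
__attribute__((noduplicate)) is an assumption about how the attribute is
exposed; the stub body exists only so the sketch compiles on its own.

```c
/* Sketch only: every IMPL_* name below is hypothetical. */
#ifndef __has_attribute
#define __has_attribute(x) 0   /* fallback for compilers without the check */
#endif

/* An implementation whose barrier must not be duplicated defines
   IMPL_NEEDS_UNIQUE_BARRIER before this point; others leave it undefined. */
#if defined(IMPL_NEEDS_UNIQUE_BARRIER) && __has_attribute(noduplicate)
#define IMPL_BARRIER_ATTR __attribute__((noduplicate)) /* assumed spelling */
#define IMPL_BARRIER_ANNOTATED 1
#else
#define IMPL_BARRIER_ATTR      /* plain declaration, duplication allowed */
#define IMPL_BARRIER_ANNOTATED 0
#endif

/* The barrier prototype, annotated only when the implementation needs it. */
IMPL_BARRIER_ATTR void impl_barrier(unsigned flags);

/* Stand-in body so the sketch links; a real barrier would synchronize. */
static unsigned impl_barrier_calls = 0;
void impl_barrier(unsigned flags)
{
    (void)flags;
    ++impl_barrier_calls;
}
```

With IMPL_NEEDS_UNIQUE_BARRIER left undefined the prototype carries no
attribute at all, which is the "doesn't need it" case.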

> AFAIU, in the specific OpenCL case one is safe if one can prove the location
> you copy the barrier (or even inject a completely new one) to is
> non-diverging.
>

Just to be clear: this stems from a difference between OpenCL abstract
semantics and how these things might be implemented on some particular
compute platforms. OpenCL just requires that all work-items wait until every
work-item has arrived at the _same_ barrier. If an implementation can
determine the "OpenCL-level" identity of a barrier even after the call has
been duplicated, it is free to duplicate and need not annotate the barrier
prototype. Some implementations determine barrier identity based upon "the
program counter" (in some sense) at the point of the barrier call: on these
implementations duplicating the call (IF control flow diverges, as you point
out) breaks the implementation.
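
A toy model of that failure mode, purely as a sketch: the "program counter"
is reduced to a call-site id, and the function names and the 64 work-item
limit are invented for illustration rather than taken from any real
implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a waiting work-item records the call-site id (standing
   in for the program counter) of the barrier call it stopped at. The barrier
   releases only if every work-item is waiting at the same site. */
static bool barrier_releases(const int *site_of_item, size_t n_items)
{
    assert(n_items > 0);
    for (size_t i = 1; i < n_items; ++i)
        if (site_of_item[i] != site_of_item[0])
            return false;   /* two sites look like two different barriers */
    return true;
}

/* After the barrier call is duplicated into both arms of a branch, a
   work-item reaches site 1 in the "then" arm or site 2 in the "else" arm. */
static int site_after_duplication(bool cond)
{
    return cond ? 1 : 2;
}

/* Run up to 64 work-items through the duplicated barrier and report whether
   this PC-based model would ever release them. */
static bool duplicated_barrier_releases(const bool *cond_of_item,
                                        size_t n_items)
{
    int sites[64];
    assert(n_items > 0 && n_items <= 64);
    for (size_t i = 0; i < n_items; ++i)
        sites[i] = site_after_duplication(cond_of_item[i]);
    return barrier_releases(sites, n_items);
}
```

In this model a uniform condition sends every work-item to the same copy, so
the barrier still releases; a single work-item taking the other arm leaves
the group split across two sites and nothing is ever released.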

In terms of non-diverging flow, isn't that the case where either it's
statically ascertainable what the control flow is, so you aren't duplicating
the call "in the final code" since the not-taken branch is removed, or there's
a


> Can you give an example of a case where one cannot duplicate
> a barrier call if the control dependencies at the duplicated
> barrier call site do not change per work-item?
>

No.

> Why and how could some architecture restrict that? AFAIU, it should not be
> able to differentiate the copy from any other (user written) barrier,
> as "additional synchronization" should be safe in that case.
>
> E.g.:
>
> for (uniform_loop) {
>   if (uniform_cond) {
>      do_something;
>   } else {
>      do_something_else;
>   }
>   barrier();
> }
>
> Loop unswitching here might produce:
>
> if (uniform_cond) {
>   for (uniform_loop) {
>      do_something;
>      barrier();
>   }
> } else {
>   for (uniform_loop) {
>      do_something_else;
>      barrier();
>   }
> }
>
> These two new loops might be more easily horizontally or
> vertically parallelized (e.g. vectorized) and the kernel
> semantics is still correct, right?
>

If the first example was written such that the inner condition was actually
based upon an unknowable-but-uniform variable, then I can't see an issue with
that. The question is how often one gets a condition on a uniform variable
that doesn't turn out to be trivially determined; when it is trivially
determined, dead code elimination means the end code has only one barrier
anyway. (The current implementation of things, AIUI, is purely local, so one
can't have a chain of transformations which temporarily duplicates such a
noduplicate call before deleting one copy later, but that's more of an issue
with the chain-of-transformations-each-valid approach than with the
noduplicate attribute, and that's a much bigger problem.)

-- 
cheers, dave tweed__________________________
high-performance computing and machine vision expert: david.tweed at gmail.com
"while having code so boring anyone can maintain it, use Python." --
attempted insult seen on slashdot