[llvm-dev] RFC: Removal of noduplicate attribute

Savonichev, Andrew via llvm-dev llvm-dev at lists.llvm.org
Tue Oct 29 09:56:55 PDT 2019

On 10/29, Nicolai Hähnle-Montoro wrote:
> On Tue, Oct 29, 2019 at 11:57 AM Savonichev, Andrew via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> > > These are good points. I think the first question should be: Do we know
> > > of any active users of this attribute right now? If not, deprecation
> > > seems like something we could do, e.g., through a warning in clang and
> > > in the middle-end to ensure other front-ends are aware of it as well.
> >
> > The noduplicate attribute is still used by the Intel OpenCL Compiler for CPU.
> > The main use case is to prevent loop unrolling when an OpenCL barrier is called
> > within a loop. Such a loop can be unrolled with its semantics intact, but
> > this introduces a lot of distinct barrier calls, and each of them has to
> > be handled separately.
> >
> > In other words, "noduplicate" serves as a hint to not unroll a loop if a
> > certain function is called in a loop body.
> I don't quite understand the reasoning behind this. Is it because your
> backend turns each individual barrier call into a large chunk of code?

It is even worse than just a large chunk of code: in order to support an OpenCL
barrier on CPU (at least in our implementation) we have to significantly change
control flow across the entire call chain. Evgeniy Tyurin gave a talk about this
at last year's LLVM Developers' Meeting [1][2]. The short summary: OpenCL barriers
on CPU are complicated, and they are *very* expensive in both performance and
compile time.

[1]: https://www.youtube.com/watch?v=Mm5ATyqm7Rw
[2]: https://llvm.org/devmtg/2018-10/slides/Tyurin-ImplementingOpenCLCompiler.pdf

> If so, would it be a long-term viable alternative to inform the
> various code size heuristics about this instead of using
> `noduplicate`?

I think so. If we can tell standard LLVM optimizations not to turn one call
into several, that should be good enough. However, that is exactly what the
current `noduplicate` attribute means, so I'm not sure what the difference
would be.
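
To make the "one call must stay one call" point concrete, here is a minimal
sketch in plain C rather than OpenCL. It is only an illustration, not code from
our compiler: `barrier_stub`, `run_kernel_loop`, `barrier_calls` and the
`NODUPLICATE` macro are made-up names, and the stub merely counts calls instead
of synchronizing anything. The only real piece is clang's
`__attribute__((noduplicate))`, which is guarded so that compilers without it
still build the file.

```c
#include <assert.h>

/* Use clang's noduplicate attribute when available; otherwise expand to
 * nothing so the sketch still compiles (assumption: behavior is unchanged,
 * only the optimizer restriction is lost). */
#if defined(__has_attribute)
#  if __has_attribute(noduplicate)
#    define NODUPLICATE __attribute__((noduplicate))
#  endif
#endif
#ifndef NODUPLICATE
#  define NODUPLICATE
#endif

/* Hypothetical stand-in for an OpenCL barrier: a regular function
 * declaration that carries the attribute, not an intrinsic. */
NODUPLICATE void barrier_stub(void);

static int barrier_calls = 0;

/* Dummy body so the example links and runs; a real barrier would
 * synchronize work-items instead of bumping a counter. */
void barrier_stub(void) { barrier_calls++; }

/* The loop body contains exactly one static call site of barrier_stub().
 * Unrolling by 2 would rewrite the body as
 *     acc += i;     barrier_stub();
 *     acc += i + 1; barrier_stub();
 * that is, two call sites. The attribute forbids that duplication. */
int run_kernel_loop(int iterations) {
    assert(iterations >= 0);
    int acc = 0;
    for (int i = 0; i < iterations; ++i) {
        acc += i;
        barrier_stub();
    }
    return acc;
}
```

Note that the number of barrier calls executed at run time is the same with or
without unrolling; what changes is the number of distinct call sites in the
code, and that is the quantity the attribute pins down.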

Another related problem is that the OpenCL barrier is not an LLVM intrinsic:
it is a regular function (declaration) that carries the attribute.
If we want to inform standard LLVM optimizations about it, this function
would have to be changed to an intrinsic, right?

