[llvm-dev] RFC: Removal of noduplicate attribute

Finkel, Hal J. via llvm-dev llvm-dev at lists.llvm.org
Tue Oct 29 11:50:55 PDT 2019


On 10/29/19 11:56 AM, Savonichev, Andrew via llvm-dev wrote:
> On 10/29, Nicolai Hähnle-Montoro wrote:
>> On Tue, Oct 29, 2019 at 11:57 AM Savonichev, Andrew via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>>>> These are good points. I think the first question should be: Do we know
>>>> of any active users of this attribute right now? If not, deprecation
>>>> seems like something we could do, e.g., through a warning in clang and
>>>> in the middle-end to ensure other front-ends are aware of it as well.
>>> The noduplicate attribute is still used by the Intel OpenCL Compiler for CPU.
>>> The main use case is to prevent loop unrolling when an OpenCL barrier is called
>>> within a loop. Such a loop can be unrolled with its semantics intact, but
>>> unrolling introduces many distinct barrier calls, and each of them has to
>>> be handled separately.
>>>
>>> In other words, "noduplicate" serves as a hint to not unroll a loop if a
>>> certain function is called in a loop body.
>> I don't quite understand the reasoning behind this. Is it because your
>> backend turns each individual barrier call into a large chunk of code?
> It is even worse than just a large chunk of code: in order to support the OpenCL
> barrier on CPU (at least in our implementation) we have to significantly change
> control flow across the entire call chain. Evgeniy Tyurin gave a talk about this
> at last year's LLVM Developers' Meeting [1][2], and a short summary is: OpenCL
> barriers on CPU are complicated, and they are *very* expensive in both run-time
> performance and compile time.
>
> [1]: https://www.youtube.com/watch?v=Mm5ATyqm7Rw
> [2]: https://llvm.org/devmtg/2018-10/slides/Tyurin-ImplementingOpenCLCompiler.pdf
>
>> If so, would it be a long-term viable alternative to inform the
>> various code size heuristics about this instead of using
>> `noduplicate`?
> I think so. If we can tell standard LLVM optimizations not to make several
> calls out of one call, that should be good enough. However, that is exactly the
> meaning of the current `noduplicate` attribute, so I'm not sure what the
> difference would be.
>
> Another related problem is the fact that the OpenCL barrier is not an LLVM
> intrinsic - it is a regular function (declaration) that has the attribute.
> If we want to inform standard LLVM optimizations about it, this function
> should be changed to an intrinsic, right?


That's an option. It's also possible to teach TargetLibraryInfo about
the barrier function. The optimizer knows about malloc(), for example,
even though that's not an intrinsic.
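
To make the idea concrete, here is a minimal, self-contained C++ sketch of
name-based recognition feeding an unroll decision. It is a toy model, not LLVM
code: ToyLibraryInfo, isCostlyToDuplicate, and shouldUnroll are hypothetical
names invented for illustration, and the real TargetLibraryInfo interface and
the real unroller heuristics look different. The only assumption carried over
from the thread is that a plain function declaration (such as the OpenCL
barrier) can be recognized by name rather than being an intrinsic.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a library-info table. This is NOT the LLVM
// TargetLibraryInfo API; it only models "the optimizer knows this name".
struct ToyLibraryInfo {
    // Names of ordinary functions whose calls are expensive to duplicate.
    std::unordered_set<std::string> costlyToDuplicate;

    bool isCostlyToDuplicate(const std::string &callee) const {
        return costlyToDuplicate.count(callee) != 0;
    }
};

// A loop body, modeled only as the list of callee names it contains.
using LoopBody = std::vector<std::string>;

// Toy unroll heuristic: decline to unroll when the body calls a function
// that the table marks as costly to duplicate. Unrolling would still be
// legal here; this models a cost decision, not a correctness constraint.
bool shouldUnroll(const LoopBody &body, const ToyLibraryInfo &info) {
    for (const std::string &callee : body) {
        if (info.isCostlyToDuplicate(callee)) {
            return false;
        }
    }
    return true;
}
```

With "barrier" registered in the table, shouldUnroll returns false for a body
that calls it and true for a body that does not. The difference from
noduplicate is that the decision lives in a heuristic that a pass may consult,
rather than in a hard restriction on every transformation.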

  -Hal


-- 
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory


