[LLVMdev] Memset/memcpy: user control of loop-idiom recognizer
Robert Lougher
rob.lougher at gmail.com
Wed Dec 3 18:21:12 PST 2014
On 2 December 2014 at 22:18, Alex Rosenberg <alexr at leftfield.org> wrote:
> On Dec 3, 2014, at 6:12 AM, Eric Christopher <echristo at gmail.com> wrote:
>
> On Tue Dec 02 2014 at 12:12:01 PM Robert Lougher <rob.lougher at gmail.com>
> wrote:
>>
>> On 2 December 2014 at 19:57, Joerg Sonnenberger <joerg at britannica.bec.de>
>> wrote:
>> > On Tue, Dec 02, 2014 at 07:23:01PM +0000, Robert Lougher wrote:
>> >> In feedback from game studios a common issue is the replacement of
>> >> loops with calls to memcpy/memset. These loops are often
>> >> hand-optimised and highly efficient, and the developers strongly want
>> >> a way to control the compiler (i.e. leave my loop alone).
>> >
>> > I doubt that. If anything, it means the lowering of the intrinsic is
>> > bad, not that the transformation should not happen.
>> >
>> > Joerg
>>
>> Yes, that's why I talked about variable and constant trip-counts. For
>> constant loops there generally isn't a problem, as they can be lowered
>> inline (if small). Variable loops, however, get expanded into a
>> library call.
>>
>
> So the biggest problem is that you don't want a call and would prefer to
> have inline memcpy code everywhere or something else? If the memcpy isn't
> being lowered efficiently I'm curious as to what isn't being lowered well.
>
>
> Our C library amplifies this problem by being in a dynamic library, so the
> call has additional overhead, which for small trip counts swamps the
> copy/set.
>
> Certainly, the lowering can be better across the many cases as discussed
> elsewhere in this thread.
>
It's also worth mentioning that when the loop-idiom recognizer is
disabled, the loop vectorizer steps in and vectorizes the loop instead.
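
For illustration, here is a minimal sketch of the kind of loops being
discussed. The function names are hypothetical, and whether a given
compiler version actually turns each loop into a memset/memcpy call (or
vectorizes it instead) is an assumption that should be checked against
the generated assembly.

```c
#include <stddef.h>

/* Variable trip count: n is not known at compile time. A loop-idiom
   recognizer may replace this loop with a call to memset, which then
   becomes a library call. */
void zero_bytes(unsigned char *dst, size_t n) {
    for (size_t i = 0; i < n; ++i)
        dst[i] = 0;
}

/* Variable trip count copy. The restrict qualifiers promise that dst
   and src do not overlap, which makes the loop a memcpy candidate. */
void copy_bytes(unsigned char *restrict dst,
                const unsigned char *restrict src, size_t n) {
    for (size_t i = 0; i < n; ++i)
        dst[i] = src[i];
}

/* Constant trip count: even if this is recognized as a memset, the
   size is small and fixed, so it can be lowered inline rather than
   emitted as a call. */
void zero_16(unsigned char *dst) {
    for (size_t i = 0; i < 16; ++i)
        dst[i] = 0;
}
```

Compiling this with something like "clang -O2 -S" and looking for calls
to memset or memcpy in the output should show which loops were replaced
and which were left alone or vectorized.
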
Rob.
>
> Alex