[cfe-dev] [PATCH] Gut InstCombiner::SimplifyMemTransfer.

Tue Aug 18 14:48:48 PDT 2015

Trying again with the right cfe-dev list address.

- Lang.

Yep - you're both right. Sorry - it has been too long since I've looked at
> this.
>
> So clang is emitting memcpys for aggregate copies instead of memmoves,
> even though the pointers might be equal. CC'ing cfe-dev, in case anyone
> there has any insights on why this is.
>
> Digging back through the clang logs it looks like this was a conscious
> decision, but it's not clear why it worked out this way:
>
> ------------------------------------------------------------------------
> r51566 | efriedma | 2008-05-26 05:59:39 -0700 (Mon, 26 May 2008) | 15 lines
>
> Emit memmove, not memcpy, for structure copies; this is unfortunately
> required for correctness in cases of copying a struct to itself or to
> an overlapping struct (itself for cases like *a = *a, and overlapping
> is possible with unions).
>
> Hopefully, this won't end up being a perf issue; LLVM *should* be able
> to optimize memmove to memcpy in a lot of cases, and for small copies
> the generated code *should* be mostly comparable. (In reality, LLVM
> is currently horrible at optimizing memmove, but that's a bug, not a
> fundamental issue.)
>
> gcc currently generates wrong code; that's
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667.
>
> ...
>
> ------------------------------------------------------------------------
> r65699 | lattner | 2009-02-28 10:18:58 -0800 (Sat, 28 Feb 2009) | 5 lines
>
> after going around in circles a few times, finally cave and emit structure
> copies with memcpy instead of memmove.  This matches what GCC does and if
> it
> causes a problem with a particular libc we can always fix it with a target
> hook.
>
>
> - Lang.
>
> On Tue, Aug 18, 2015 at 2:20 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>
>> ----- Original Message -----
>> > From: "Lang Hames" <lhames at gmail.com>
>> > To: "Hal Finkel" <hfinkel at anl.gov>
>> > Cc: "Commit Messages and Patches for LLVM" <llvm-commits at lists.llvm.org>,
>> "Chandler Carruth" <chandlerc at gmail.com>
>> > Sent: Tuesday, August 18, 2015 1:22:14 PM
>> > Subject: Re: [PATCH] Gut InstCombiner::SimplifyMemTransfer.
>> >
>> >
>> > Hi David, Hal,
>> >
>> >
>> > Thanks very much for the review!
>> >
>> >
>> > The clang issue should be fixed, but I think it's orthogonal to this
>> > patch: Since the memcpys that clang emits aren't annotated they'll
>> > be treated conservatively by my patch (i.e. not optimized).
>> >
>>
>> I don't understand. The logic in InstCombine applies to all memcpys that
>> are expended into individual loads and stores (a size of 1, 2, 4, or 8
>> bytes and properly aligned), and so you would end up annotating the
>> resulting memory accesses, no?
>>
>>  -Hal
>>
>> >
>> > Unless I've misunderstood that, I think it should be safe to land
>> > this once I make the auto* fix that David suggested?
>> >
>> >
>> > Cheers,
>> > Lang.
>> >
>> >
>> >
>> >
>> > On Sun, Aug 16, 2015 at 3:23 AM, Hal Finkel < hfinkel at anl.gov >
>> > wrote:
>> >
>> >
>> > ----- Original Message -----
>> > > From: "Lang Hames via llvm-commits" < llvm-commits at lists.llvm.org >
>> > > To: "Chandler Carruth" < chandlerc at gmail.com >
>> > > Cc: "Commit Messages and Patches for LLVM" <
>> > > llvm-commits at lists.llvm.org >
>> > > Sent: Wednesday, August 12, 2015 9:06:52 PM
>> > > Subject: Re: [PATCH] Gut InstCombiner::SimplifyMemTransfer.
>> > >
>> > >
>> > >
>> > > Another belated *ping*.
>> > >
>> >
>> > The SDAG change LGTM.
>> >
>> > The InstCombine change also LGTM, but it seems that we need to fix
>> > Clang before we can make this change. Let's add this under a flag
>> > for testing.
>> >
>> > -Hal
>> >
>> >
>> >
>> > >
>> > >
>> > > - Lang.
>> > >
>> > >
>> > > On Tue, Jun 30, 2015 at 2:16 PM, Lang Hames < lhames at gmail.com >
>> > > wrote:
>> > >
>> > >
>> > >
>> > > Belated ping. :)
>> > >
>> > >
>> > > - Lang.
>> > >
>> > > On Tue, May 12, 2015 at 2:11 PM, Lang Hames < lhames at gmail.com >
>> > > wrote:
>> > >
>> > >
>> > >
>> > > Gentle ping.
>> > >
>> > >
>> > > - Lang.
>> > >
>> > >
>> > >
>> > >
>> > > On Tue, Apr 28, 2015 at 11:51 AM, Lang Hames < lhames at gmail.com >
>> > > wrote:
>> > >
>> > >
>> > >
>> > > Ping / delayed reply.
>> > >
>> > >
>> > > DannyB - thanks for the extra context. :)
>> > >
>> > >
>> > > Chandler - any thoughts? Assuming you still want to keep this
>> > > transform, what do you think of my instcombine change? (i.e.
>> > > tagging
>> > > the load/store pointers as non-aliasing for memcpy).
>> > >
>> > >
>> > > Cheers,
>> > > Lang.
>> > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > On Tue, Apr 21, 2015 at 9:21 PM, Daniel Berlin <
>> > > dberlin at dberlin.org
>> > > > wrote:
>> > >
>> > >
>> > > On Wed, Apr 8, 2015 at 10:38 PM, Chandler Carruth <
>> > > chandlerc at gmail.com > wrote:
>> > > > On Wed, Apr 8, 2015 at 10:15 PM Lang Hames < lhames at gmail.com >
>> > > > wrote:
>> > > >>
>> > > >> Hi David, Chandler,
>> > > >>
>> > > >> The attached patch guts the SimplifyMemTransfer method, which
>> > > >> turns small
>> > > >> memcpys (1/2/4/8 bytes) into load/store pairs. Turning memcpys
>> > > >> into
>> > > >> load/store pairs loses information, since we can no longer
>> > > >> assume
>> > > >> the source
>> > > >> and dest are non-overlapping. This is leading to some suboptimal
>> > > >> expansions
>> > > >> for small memcpys on AArch64 when -mno-unaligned-access is
>> > > >> turned
>> > > >> on (see
>> > > >> r234462). I suspect other architectures would suffer similar
>> > > >> issues.
>> > > >>
>> > > >> I assume this transform is an old workaround to simplify other
>> > > >> non-memcpy-aware IR transforms. These days I think most IR
>> > > >> transforms can
>> > > >> reason sensibly about memcpys, so I'm hoping this is safe to
>> > > >> remove. FWIW,
>> > > >> removing it didn't hit any regression tests except those that
>> > > >> were
>> > > >> verifying
>> > > >> that this optimisation was being applied, but then you wouldn't
>> > > >> really
>> > > >> expect it to hit any others.
>> > > >
>> > > >
>> > > > Heh. I tried to remove it before and it regressed a *lot* of
>> > > > performance.
>> > > > Have you measured it? I think there are many places that don't
>> > > > today reason
>> > > > about memcpy but do reason about loads and stores. Here is a
>> > > > partial list:
>> > > >
>> > > > - GVN
>> > >
>> > > Note:
>> > >
>> > > It reasons about memory intrinsics, including memcpy and memset,
>> > > and
>> > > will pull values out to forward.
>> > >
>> > > for memset, if the load is in the memset size, it will compute what
>> > > the value is.
>> > > For memcpy, it handles anything that is constant memory.
>> > >
>> > >
>> > > (Certainly, loads and stores are handled better, but ...)
>> > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> >
>> >
>> > > _______________________________________________
>> > > llvm-commits mailing list
>> > > llvm-commits at lists.llvm.org
>> > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>> > >
>> >
>> > --
>> > Hal Finkel
>> > Assistant Computational Scientist
>> > Leadership Computing Facility
>> > Argonne National Laboratory
>> >
>> >
>>
>> --
>> Hal Finkel
>> Assistant Computational Scientist
>> Leadership Computing Facility
>> Argonne National Laboratory
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20150818/eae1528c/attachment.html>