[PATCH] Gut InstCombiner::SimplifyMemTransfer.

Lang Hames via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 18 14:46:18 PDT 2015


Yep - you're both right. Sorry - it has been too long since I've looked at
this.

So clang is emitting memcpys for aggregate copies instead of memmoves, even
though the pointers might be equal. CC'ing cfe-dev, in case anyone there
has any insights on why this is.

Digging back through the clang logs it looks like this was a conscious
decision, but it's not clear why it worked out this way:

------------------------------------------------------------------------
r51566 | efriedma | 2008-05-26 05:59:39 -0700 (Mon, 26 May 2008) | 15 lines

Emit memmove, not memcpy, for structure copies; this is unfortunately
required for correctness in cases of copying a struct to itself or to
an overlapping struct (itself for cases like *a = *a, and overlapping
is possible with unions).

Hopefully, this won't end up being a perf issue; LLVM *should* be able
to optimize memmove to memcpy in a lot of cases, and for small copies
the generated code *should* be mostly comparable. (In reality, LLVM
is currently horrible at optimizing memmove, but that's a bug, not a
fundamental issue.)

gcc currently generates wrong code; that's
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32667.

...

------------------------------------------------------------------------
r65699 | lattner | 2009-02-28 10:18:58 -0800 (Sat, 28 Feb 2009) | 5 lines

after going around in circles a few times, finally cave and emit structure
copies with memcpy instead of memmove.  This matches what GCC does and if it
causes a problem with a particular libc we can always fix it with a target
hook.


- Lang.

On Tue, Aug 18, 2015 at 2:20 PM, Hal Finkel <hfinkel at anl.gov> wrote:

> ----- Original Message -----
> > From: "Lang Hames" <lhames at gmail.com>
> > To: "Hal Finkel" <hfinkel at anl.gov>
> > Cc: "Commit Messages and Patches for LLVM" <llvm-commits at lists.llvm.org>,
> "Chandler Carruth" <chandlerc at gmail.com>
> > Sent: Tuesday, August 18, 2015 1:22:14 PM
> > Subject: Re: [PATCH] Gut InstCombiner::SimplifyMemTransfer.
> >
> >
> > Hi David, Hal,
> >
> >
> > Thanks very much for the review!
> >
> >
> > The clang issue should be fixed, but I think it's orthogonal to this
> > patch: Since the memcpys that clang emits aren't annotated they'll
> > be treated conservatively by my patch (i.e. not optimized).
> >
>
> I don't understand. The logic in InstCombine applies to all memcpys that
> are expended into individual loads and stores (a size of 1, 2, 4, or 8
> bytes and properly aligned), and so you would end up annotating the
> resulting memory accesses, no?
>
>  -Hal
>
> >
> > Unless I've misunderstood that, I think it should be safe to land
> > this once I make the auto* fix that David suggested?
> >
> >
> > Cheers,
> > Lang.
> >
> >
> >
> >
> > On Sun, Aug 16, 2015 at 3:23 AM, Hal Finkel < hfinkel at anl.gov >
> > wrote:
> >
> >
> > ----- Original Message -----
> > > From: "Lang Hames via llvm-commits" < llvm-commits at lists.llvm.org >
> > > To: "Chandler Carruth" < chandlerc at gmail.com >
> > > Cc: "Commit Messages and Patches for LLVM" <
> > > llvm-commits at lists.llvm.org >
> > > Sent: Wednesday, August 12, 2015 9:06:52 PM
> > > Subject: Re: [PATCH] Gut InstCombiner::SimplifyMemTransfer.
> > >
> > >
> > >
> > > Another belated *ping*.
> > >
> >
> > The SDAG change LGTM.
> >
> > The InstCombine change also LGTM, but it seems that we need to fix
> > Clang before we can make this change. Let's add this under a flag
> > for testing.
> >
> > -Hal
> >
> >
> >
> > >
> > >
> > > - Lang.
> > >
> > >
> > > On Tue, Jun 30, 2015 at 2:16 PM, Lang Hames < lhames at gmail.com >
> > > wrote:
> > >
> > >
> > >
> > > Belated ping. :)
> > >
> > >
> > > - Lang.
> > >
> > > On Tue, May 12, 2015 at 2:11 PM, Lang Hames < lhames at gmail.com >
> > > wrote:
> > >
> > >
> > >
> > > Gentle ping.
> > >
> > >
> > > - Lang.
> > >
> > >
> > >
> > >
> > > On Tue, Apr 28, 2015 at 11:51 AM, Lang Hames < lhames at gmail.com >
> > > wrote:
> > >
> > >
> > >
> > > Ping / delayed reply.
> > >
> > >
> > > DannyB - thanks for the extra context. :)
> > >
> > >
> > > Chandler - any thoughts? Assuming you still want to keep this
> > > transform, what do you think of my instcombine change? (i.e.
> > > tagging
> > > the load/store pointers as non-aliasing for memcpy).
> > >
> > >
> > > Cheers,
> > > Lang.
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Tue, Apr 21, 2015 at 9:21 PM, Daniel Berlin <
> > > dberlin at dberlin.org
> > > > wrote:
> > >
> > >
> > > On Wed, Apr 8, 2015 at 10:38 PM, Chandler Carruth <
> > > chandlerc at gmail.com > wrote:
> > > > On Wed, Apr 8, 2015 at 10:15 PM Lang Hames < lhames at gmail.com >
> > > > wrote:
> > > >>
> > > >> Hi David, Chandler,
> > > >>
> > > >> The attached patch guts the SimplifyMemTransfer method, which
> > > >> turns small
> > > >> memcpys (1/2/4/8 bytes) into load/store pairs. Turning memcpys
> > > >> into
> > > >> load/store pairs loses information, since we can no longer
> > > >> assume
> > > >> the source
> > > >> and dest are non-overlapping. This is leading to some suboptimal
> > > >> expansions
> > > >> for small memcpys on AArch64 when -mno-unaligned-access is
> > > >> turned
> > > >> on (see
> > > >> r234462). I suspect other architectures would suffer similar
> > > >> issues.
> > > >>
> > > >> I assume this transform is an old workaround to simplify other
> > > >> non-memcpy-aware IR transforms. These days I think most IR
> > > >> transforms can
> > > >> reason sensibly about memcpys, so I'm hoping this is safe to
> > > >> remove. FWIW,
> > > >> removing it didn't hit any regression tests except those that
> > > >> were
> > > >> verifying
> > > >> that this optimisation was being applied, but then you wouldn't
> > > >> really
> > > >> expect it to hit any others.
> > > >
> > > >
> > > > Heh. I tried to remove it before and it regressed a *lot* of
> > > > performance.
> > > > Have you measured it? I think there are many places that don't
> > > > today reason
> > > > about memcpy but do reason about loads and stores. Here is a
> > > > partial list:
> > > >
> > > > - GVN
> > >
> > > Note:
> > >
> > > It reasons about memory intrinsics, including memcpy and memset,
> > > and
> > > will pull values out to forward.
> > >
> > > for memset, if the load is in the memset size, it will compute what
> > > the value is.
> > > For memcpy, it handles anything that is constant memory.
> > >
> > >
> > > (Certainly, loads and stores are handled better, but ...)
> > >
> > >
> > >
> > >
> > >
> > >
> >
> >
> > > _______________________________________________
> > > llvm-commits mailing list
> > > llvm-commits at lists.llvm.org
> > > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
> > >
> >
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> >
> >
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150818/2130e003/attachment.html>


More information about the llvm-commits mailing list