[PATCH] Gut InstCombiner::SimplifyMemTransfer.

Sun Aug 16 03:08:32 PDT 2015

[+Richard, John, Chandler]

----- Original Message -----
> From: "David Majnemer via llvm-commits" <llvm-commits at lists.llvm.org>
> To: "Lang Hames" <lhames at gmail.com>
> Cc: "Commit Messages and Patches for LLVM" <llvm-commits at lists.llvm.org>
> Sent: Wednesday, August 12, 2015 10:35:16 PM
> Subject: Re: [PATCH] Gut InstCombiner::SimplifyMemTransfer.
> 
> 
> 
> Hi Lang,
> 
> 
> I reviewed the InstCombine side and it looks great, all that I ask is
> that you replace the `Instruction *` with `auto *` because the type
> is replicated in the `dyn_cast`.
> 
> 
> My only concern is that clang will actually create calls to the
> memcpy intrinsic from within clang with pointers which exactly
> alias:
> http://clang.llvm.org/doxygen/CGExprAgg_8cpp_source.html#l01448
> 

That seems very broken. We need to fix this. In fact, the comment admits as much:

01454   // memcpy is not defined if the source and destination pointers are exactly
01455   // equal, but other compilers do this optimization, and almost every memcpy
01456   // implementation handles this case safely.  If there is a libc that does not
01457   // safely handle this, we can add a target hook.

Is there any reason not to emit memmove instead?

 -Hal

> 
> I cannot confess to being an alias-analysis expert but my concern is
> that annotating the memcpy as such could create some interesting
> paradoxes. Those more familiar with AA should chime in.
> 
> 
> I'm afraid I cannot provide any guidance on the SelectionDAG changes.
> 
> 
> Thanks
> --
> David Majnemer
> 
> 
> On Wed, Aug 12, 2015 at 10:06 PM, Lang Hames via llvm-commits <
> llvm-commits at lists.llvm.org > wrote:
> 
> 
> 
> Another belated *ping*.
> 
> 
> 
> - Lang.
> 
> 
> 
> 
> 
> 
> 
> 
> 
> On Tue, Jun 30, 2015 at 2:16 PM, Lang Hames < lhames at gmail.com >
> wrote:
> 
> 
> 
> Belated ping. :)
> 
> 
> - Lang.
> 
> 
> 
> 
> On Tue, May 12, 2015 at 2:11 PM, Lang Hames < lhames at gmail.com >
> wrote:
> 
> 
> 
> Gentle ping.
> 
> 
> - Lang.
> 
> 
> 
> 
> On Tue, Apr 28, 2015 at 11:51 AM, Lang Hames < lhames at gmail.com >
> wrote:
> 
> 
> 
> Ping / delayed reply.
> 
> 
> DannyB - thanks for the extra context. :)
> 
> 
> Chandler - any thoughts? Assuming you still want to keep this
> transform, what do you think of my instcombine change? (i.e. tagging
> the load/store pointers as non-aliasing for memcpy).
> 
> 
> Cheers,
> Lang.
> 
> 
> 
> 
> 
> 
> On Tue, Apr 21, 2015 at 9:21 PM, Daniel Berlin < dberlin at dberlin.org
> > wrote:
> 
> 
> On Wed, Apr 8, 2015 at 10:38 PM, Chandler Carruth <
> chandlerc at gmail.com > wrote:
> > On Wed, Apr 8, 2015 at 10:15 PM Lang Hames < lhames at gmail.com >
> > wrote:
> >> 
> >> Hi David, Chandler,
> >> 
> >> The attached patch guts the SimplifyMemTransfer method, which
> >> turns small
> >> memcpys (1/2/4/8 bytes) into load/store pairs. Turning memcpys
> >> into
> >> load/store pairs loses information, since we can no longer assume
> >> the source
> >> and dest are non-overlapping. This is leading to some suboptimal
> >> expansions
> >> for small memcpys on AArch64 when -mno-unaligned-access is turned
> >> on (see
> >> r234462). I suspect other architectures would suffer similar
> >> issues.
> >> 
> >> I assume this transform is an old workaround to simplify other
> >> non-memcpy-aware IR transforms. These days I think most IR
> >> transforms can
> >> reason sensibly about memcpys, so I'm hoping this is safe to
> >> remove. FWIW,
> >> removing it didn't hit any regression tests except those that were
> >> verifying
> >> that this optimisation was being applied, but then you wouldn't
> >> really
> >> expect it to hit any others.
> > 
> > 
> > Heh. I tried to remove it before and it regressed a *lot* of
> > performance.
> > Have you measured it? I think there are many places that don't
> > today reason
> > about memcpy but do reason about loads and stores. Here is a
> > partial list:
> > 
> > - GVN
> 
> Note:
> 
> It reasons about memory intrinsics, including memcpy and memset, and
> will pull values out to forward.
> 
> for memset, if the load is in the memset size, it will compute what
> the value is.
> For memcpy, it handles anything that is constant memory.
> 
> 
> (Certainly, loads and stores are handled better, but ...)
> 
> 
> 
> 
> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
> 
> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory