[PATCH] Gut InstCombiner::SimplifyMemTransfer.

Chandler Carruth chandlerc at gmail.com
Wed Apr 8 22:38:25 PDT 2015


On Wed, Apr 8, 2015 at 10:15 PM Lang Hames <lhames at gmail.com> wrote:

> Hi David, Chandler,
>
> The attached patch guts the SimplifyMemTransfer method, which turns small
> memcpys (1/2/4/8 bytes) into load/store pairs. Turning memcpys into
> load/store pairs loses information, since we can no longer assume the
> source and dest are non-overlapping. This is leading to some suboptimal
> expansions for small memcpys on AArch64 when -mno-unaligned-access is
> turned on (see r234462). I suspect other architectures would suffer similar
> issues.
>
> I assume this transform is an old workaround to simplify other
> non-memcpy-aware IR transforms. These days I think most IR transforms can
> reason sensibly about memcpys, so I'm hoping this is safe to remove. FWIW,
> removing it didn't hit any regression tests except those that were
> verifying that this optimisation was being applied, but then you wouldn't
> really expect it to hit any others.
>

Heh. I tried to remove it before and it regressed a *lot* of performance.
Have you measured it? I think there are many places that don't today reason
about memcpy but do reason about loads and stores. Here is a partial list:

- GVN
- ValueTracking.cpp's available loaded value (or whatever its called) which
drives load combining and store-to-load forwarding throughout instcombine
and the IR
- EarlyCSE
- LoopVectorize

I thought about fixing all of this, but it seems really complicated and to
have very little value. Loads and stores and SSA values are really useful.
Do you see any other way to solve the problem of non-overlapping
information?


>
> If this transform really is useful then we should probably revisit the
> cut-off: 8-bytes isn't much these days.
>

Yea, this has been kind of horrible. I think the correct heuristic would be
when the size is one for which we have a legal integer type.


> Perhaps we should also only apply it if the alignment on the memcpy is
> sufficiently high?
>

Is an under aligned memcpy really that much better than an under aligned
load and store???

>
> Cheers,
> Lang.
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150409/b469f1c0/attachment.html>


More information about the llvm-commits mailing list