[PATCH] Gut InstCombiner::SimplifyMemTransfer.

Thu Apr 9 19:21:11 PDT 2015

----- Original Message -----
> From: "Lang Hames" <lhames at gmail.com>
> To: "Chandler Carruth" <chandlerc at gmail.com>
> Cc: "Commit Messages and Patches for LLVM" <llvm-commits at cs.uiuc.edu>
> Sent: Thursday, April 9, 2015 5:18:56 PM
> Subject: Re: [PATCH] Gut InstCombiner::SimplifyMemTransfer.
> 
> 
> 
> Hi Chandler,
> 
> 
> How about attaching noalias metadata to the load/store pointers when
> we simplify memcpy? Then the backend could use the noalias info to
> reconstitute memcpys for misaligned load/store pairs.
> 
> 
> Tentative patch attached - I'm still trying to wrap my head around
> the noalias metadata stuff, but I think this is heading in the right
> direction.

That should indeed mark the source and destination accesses as mutually-non-aliasing. It is a bit overkill, because you don't need to explicitly mark both directions (if A does not alias with B, then B does not alias with A); ScopedNoAliasAA already checks both directions (in ScopedNoAliasAA.cpp):

  if (!mayAliasInScopes(AScopes, BNoAlias))
    return NoAlias;

  if (!mayAliasInScopes(BScopes, ANoAlias))
    return NoAlias;

If you want to use AA during CodeGen (including SelectionDAG), the target will need to return true from useAA().
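
To make the symmetry point concrete, here is a minimal standalone sketch of the two-direction check. It is not the code from ScopedNoAliasAA.cpp: the AccessInfo struct, the use of integer scope IDs, and the single-domain simplification are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <set>

// Hypothetical stand-in for the scoped-noalias metadata on one access.
// Integers model scope IDs; all scopes are assumed to share one domain.
struct AccessInfo {
  std::set<int> Scopes;   // models !alias.scope
  std::set<int> NoAlias;  // models !noalias
};

// Simplified model: the accesses are known not to alias when every scope
// the first access belongs to appears in the other access's noalias list.
bool mayAliasInScopes(const std::set<int> &Scopes,
                      const std::set<int> &NoAlias) {
  if (Scopes.empty() || NoAlias.empty())
    return true;  // no information, so conservatively assume aliasing
  // std::set iterates in sorted order, as std::includes requires.
  return !std::includes(NoAlias.begin(), NoAlias.end(),
                        Scopes.begin(), Scopes.end());
}

// Checks both directions, so annotating only one direction is enough.
bool mayAlias(const AccessInfo &A, const AccessInfo &B) {
  if (!mayAliasInScopes(A.Scopes, B.NoAlias))
    return false;
  if (!mayAliasInScopes(B.Scopes, A.NoAlias))
    return false;
  return true;
}
```

With a load in scope 1 and a store whose noalias list names scope 1, mayAlias returns false whichever access is passed first, even though the store carries no scope of its own and the load carries no noalias list.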

 -Hal

> 
> 
> - Lang.
> 
> 
> 
> On Wed, Apr 8, 2015 at 11:29 PM, Lang Hames < lhames at gmail.com >
> wrote:
> 
> 
> 
> > Is the problem specific to misaligned loads and stores? Because
> > that seems much easier to solve.
> 
> 
> Mostly, but with a twist: Misaligned loads/stores are fine if your
> target supports them (I don't imagine they're a problem on x86), but
> as far as I know we don't have a way to express to the mid-level
> optimisers whether a misaligned load/store will be *bad* on the
> current target CPU (in the sense that SelectionDAG will have to
> expand it to something much larger). The relevant information is
> currently on TargetLowering. Maybe we need to move it somewhere more
> accessible.
> 
> 
> > But why does lowering as memcpy help? Essentially, I don't
> > understand why we can't use exactly the same lowering strategy
> > that memcpy (or memmove for that matter) would use and get the
> > same effect.
> 
> 
> As far as I can see the only way memcpy helps is by conveying that
> the source and dest are non-overlapping. (None of this applies to
> memmove - as far as I know that function can be converted to a
> load/store pair with no loss of information).
> Take the simple case where memcpy lowering is just going to issue a
> series of load-byte / store-byte instructions: That's not a legal
> way to lower an arbitrary load/store pair, since in the latter
> case the source and dest may overlap, but it is legal for a memcpy.
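
The overlap hazard described above can be reproduced with a small standalone sketch. The helper names and the 8-byte buffer are hypothetical, and the temporary uint32_t only models a single i32 load followed by a single i32 store; this is not what any backend actually emits.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

using Buffer = std::array<std::uint8_t, 8>;

// Models a 4-byte load followed by a 4-byte store: all source bytes are
// read into a temporary before any destination byte is written.
Buffer copyViaLoadStore(Buffer buf, std::size_t src, std::size_t dst) {
  std::uint32_t tmp;
  std::memcpy(&tmp, buf.data() + src, sizeof(tmp));  // the "load"
  std::memcpy(buf.data() + dst, &tmp, sizeof(tmp));  // the "store"
  return buf;
}

// Models a byte-wise expansion: each byte is stored before the next one
// is loaded, which is only safe when the two ranges do not overlap.
Buffer copyBytewise(Buffer buf, std::size_t src, std::size_t dst) {
  for (std::size_t i = 0; i < 4; ++i)
    buf[dst + i] = buf[src + i];
  return buf;
}
```

For disjoint ranges (src = 0, dst = 4) the two helpers produce the same buffer. For overlapping ranges (src = 0, dst = 1) on the bytes 1..8, the load/store version yields 1 1 2 3 4 6 7 8, while the byte-wise version yields 1 1 1 1 1 6 7 8, which is why the byte-wise expansion needs the non-overlap guarantee.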
> 
> 
> > FWIW, my suggestion about using legal integer types should not
> > raise the cap at all, it should lower it on specific targets where
> > we can't actually fit an 8-byte integer in a register.
> 
> 
> Oh, right. Sorry - I mis-parsed that as legal primitive type, and
> imagined applying it to vector types. That raises an interesting
> question though: Would it be useful to apply this logic to vector
> types too?
> 
> 
> Cheers,
> Lang.
> 
> 
> 
> 
> On Wed, Apr 8, 2015 at 11:11 PM, Chandler Carruth <
> chandlerc at gmail.com > wrote:
> 
> 
> 
> 
> On Wed, Apr 8, 2015 at 10:57 PM Lang Hames < lhames at gmail.com >
> wrote:
> 
> 
> 
> Hi Chandler,
> 
> 
> Not as easy as I was hoping then.
> 
> 
> 
> > Do you see any other way to solve the problem of non-overlapping
> > information?
> 
> 
> 
> I'll have to do some reading. If there's any aliasing metadata that
> we can attach to express that the pointers are disjoint, that would
> work: In SelectionDAGBuilder we could detect disjoint, misaligned
> load/store pairs where the load has no other users and use the
> memcpy expansion instead.
> 
> 
> 
> Is the problem specific to misaligned loads and stores? Because that
> seems much easier to solve.
> 
> > Is an under aligned memcpy really that much better than an under
> > aligned load and store???
> 
> 
> 
> 
> It saves a bit of shifting and masking as you try to reconstruct the
> full iN value in a register.
> 
> 
> But why does lowering as memcpy help? Essentially, I don't understand
> why we can't use exactly the same lowering strategy that memcpy (or
> memmove for that matter) would use and get the same effect.
> 
> 
> If your concern is code size, I'm honestly still surprised that
> memcpy is smaller, but fine, emit the call *whenever* you end up
> with misaligned loads and stores?
> 
> 
> 
> 
> This would be exacerbated if we raised the size cap. I'll see if I
> can get you some numbers.
> 
> 
> FWIW, my suggestion about using legal integer types should not raise
> the cap at all, it should lower it on specific targets where we
> can't actually fit an 8-byte integer in a register.
> 
> 
> Anyways, I don't think we need numbers if the primary concern is
> misaligned loads and stores. I just think that there has to be
> *some* lowering strategy that works in general for misaligned loads
> and stores and is no worse than calling memcpy.
> 
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory