[PATCH] Gut InstCombiner::SimplifyMemTransfer.

Lang Hames lhames at gmail.com
Wed Apr 8 23:29:38 PDT 2015


> Is the problem specific to misaligned loads and stores? Because that
> seems much easier to solve.

Mostly, but with a twist: Misaligned loads/stores are fine if your target
supports them (I don't imagine they're a problem on x86), but as far as I
know we don't have a way to express to the mid-level optimisers whether a
misaligned load/store will be *bad* on the current target CPU (in the sense
that SelectionDAG will have to expand it to something much larger). The
relevant information is currently on TargetLowering. Maybe we need to move
it somewhere more accessible.
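
To make "expand it to something much larger" concrete, here is a rough C
sketch of what an align-1 i32 load and store turn into on a target with no
misaligned access support. The function names are hypothetical and
little-endian byte order is assumed; this is not what SelectionDAG literally
emits, only an illustration of the shape of the expansion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical expansion of an align-1 i32 load on a target that cannot
   do misaligned word loads: four byte loads plus shifts and ORs.
   Little-endian byte order is an assumption made for illustration. */
uint32_t load_u32_align1(const unsigned char *p) {
    assert(p != 0);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Matching store expansion: shift and mask the value into four byte stores. */
void store_u32_align1(unsigned char *p, uint32_t v) {
    assert(p != 0);
    p[0] = (unsigned char)(v & 0xff);
    p[1] = (unsigned char)((v >> 8) & 0xff);
    p[2] = (unsigned char)((v >> 16) & 0xff);
    p[3] = (unsigned char)((v >> 24) & 0xff);
}
```

A single load/store pair becomes four loads, four stores and the shift/or
work in between, which is the cost the mid-level optimisers currently have
no way to see.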

> But why does lowering as memcpy help? Essentially, I don't understand why
> we can't use exactly the same lowering strategy that memcpy (or memmove for
> that matter) would use and get the same effect.

As far as I can see the only way memcpy helps is by conveying that the
source and dest are non-overlapping. (None of this applies to memmove - as
far as I know that function can be converted to a load/store pair with no
loss of information).
Take the simple case where memcpy lowering is just going to issue a series
of load-byte / store-byte instructions: That's not a legal way to lower
an arbitrary load/store pair, since in the latter case the source and dest
may overlap, but it is legal for a memcpy.
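
A small C sketch of the overlap problem, with hypothetical function names:
copy_as_load_store models the load/store pair (the whole 8-byte value is
read before anything is written), while copy_bytewise models a naive
byte-by-byte memcpy expansion. They agree when the buffers are disjoint and
disagree when the destination overlaps the source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Models an i64 load followed by an i64 store: all eight bytes are read
   into a temporary before any byte of the destination is written. */
void copy_as_load_store(unsigned char *dst, const unsigned char *src) {
    uint64_t tmp;
    assert(dst && src);
    memcpy(&tmp, src, sizeof tmp);  /* the load */
    memcpy(dst, &tmp, sizeof tmp);  /* the store */
}

/* Models a byte-wise expansion: interleaved load-byte / store-byte.
   Only equivalent to the pair above if dst and src do not overlap. */
void copy_bytewise(unsigned char *dst, const unsigned char *src) {
    assert(dst && src);
    for (size_t i = 0; i < 8; ++i)
        dst[i] = src[i];
}

/* Returns 1 if the two strategies give different results when the
   destination starts one byte past the source. */
int overlapping_results_differ(void) {
    unsigned char a[9], b[9];
    for (int i = 0; i < 9; ++i)
        a[i] = b[i] = (unsigned char)(i + 1);
    copy_as_load_store(a + 1, a);
    copy_bytewise(b + 1, b);
    return memcmp(a, b, sizeof a) != 0;
}
```

In the overlapping case the byte-wise version re-reads bytes it has already
overwritten and smears the first byte across the destination, so it is only
a valid expansion when something (like memcpy's contract) rules overlap out.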

> FWIW, my suggestion about using legal integer types should not raise the
> cap at all, it should lower it on specific targets where we can't actually
> fit an 8-byte integer in a register.

Oh, right. Sorry - I mis-parsed that as legal primitive type, and imagined
applying it to vector types. That raises an interesting question though:
Would it be useful to apply this logic to vector types too?

Cheers,
Lang.

On Wed, Apr 8, 2015 at 11:11 PM, Chandler Carruth <chandlerc at gmail.com>
wrote:

> On Wed, Apr 8, 2015 at 10:57 PM Lang Hames <lhames at gmail.com> wrote:
>
>> Hi Chandler,
>>
>> Not as easy as I was hoping then.
>>
>> > Do you see any other way to solve the problem of non-overlapping
>> > information?
>>
>> I'll have to do some reading. If there's any aliasing metadata that we
>> can attach to express that the pointers are disjoint, that would work: In
>> SelectionDAGBuilder we could detect disjoint, misaligned load/store pairs
>> where the load has no other users and use the memcpy expansion instead.
>>
>
> Is the problem specific to misaligned loads and stores? Because that seems
> much easier to solve.
>
>
>>
>> > Is an under aligned memcpy really that much better than an under
>> > aligned load and store???
>>
>> It saves a bit of shifting and masking as you try to reconstruct the full
>> iN value in a register.
>>
>
> But why does lowering as memcpy help? Essentially, I don't understand why
> we can't use exactly the same lowering strategy that memcpy (or memmove for
> that matter) would use and get the same effect.
>
> If your concern is code size, I'm honestly still surprised that memcpy is
> smaller, but fine, emit the call *whenever* you end up with misaligned
> loads and stores?
>
>
>> This would be exacerbated if we raised the size cap. I'll see if I can
>> get you some numbers.
>>
>
> FWIW, my suggestion about using legal integer types should not raise the
> cap at all, it should lower it on specific targets where we can't actually
> fit an 8-byte integer in a register.
>
> Anyways, I don't think we need numbers if the primary concern is
> misaligned loads and stores. I just think that there has to be *some*
> lowering strategy that works in general for misaligned loads and stores and
> is no worse than calling memcpy.
>