[llvm-dev] Aggregate load/stores

Sun Aug 16 23:21:05 PDT 2015

On Sun, Aug 16, 2015 at 10:27 PM, deadal nix <deadalnix at gmail.com> wrote:

>
>
> 2015-08-16 22:10 GMT-07:00 David Majnemer <david.majnemer at gmail.com>:
>>
>>
>> I would argue that a fix in the wrong direction is worse than the status
>> quo.
>>
>
> How is the proposed change worse than the status quo?
>

Because a solution which doesn't generalize is not a very powerful
solution.  What happens when somebody says that they want to use atomics +
large aggregate loads and stores? Give them yet another, different answer?
That would mean either that our earlier, less general approach was a
bandaid (bad) or that the new answer requires a parallel code path in their
frontend (worse).

>
>
>>
>>
>>>
>>> The argument that targets are relying on InstCombine to mitigate IR
>>> requiring legalization seems dubious to me. First, because both aggregates
>>> and large scalars require legalization, so, if not ideal, the proposed
>>> change does not make things any worse than they already are. In fact, as
>>> far as legalization is concerned, these are pretty much the same. It
>>> should also be noted that InstCombine is not guaranteed to run before the
>>> target, so it seems like a bad idea to me to rely on it in the backend.
>>>
>>
>> InstCombine is not guaranteed to run before IR hits the backend but the
>> result of legalizing the machinations of InstCombine's output during
>> SelectionDAG is worse than generating illegal IR in the first place.
>>
>
> That does not follow. InstCombine is not creating new things that require
> legalization, it changes one thing that requires legalization into another
> that a larger part of LLVM can understand.
>

I'm afraid I don't understand what you are getting at here.  InstCombine
carefully avoids ptrtoint to weird types, truncs to weird types, etc. when
creating new IR.

>
>
>>
>>
>>>
>>> As for the big integral thing, I really don't care. I can change it to
>>> create multiple loads/stores respecting data layout, I have the code for
>>> that and could adapt it for this PR without too much trouble. If this is
>>> the only thing that is blocking this PR, then we can proceed. But I'd like
>>> some notion that we are making progress. Would you be willing to accept a
>>> solution based on creating a series of loads/stores respecting the datalayout?
>>>
>>
>> Splitting the memory operation into smaller operations is not semantics
>> preserving from an IR-theoretic perspective.  For example, splitting a
>> volatile memory operation into several volatile memory operations is not
>> OK.  Same goes with atomics.  Some targets provide atomic memory operations
>> at the granularity of a cache line and splitting at legal integer
>> granularity would be observably different.
>>
>>
> That is off topic. The proposed patch explicitly gates for this.
>

Then I guess we agree to disagree about what is "on topic".  I think that
our advice to frontend authors regarding larger-than-legal loads/stores
should be uniform and not dependent on whether or not the operation was
volatile.
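
To make the splitting concern quoted above concrete, here is a small
simulation, not LLVM code. It models memory as a byte array and shows that a
64-bit store split into two 32-bit stores exposes an intermediate state that
a single store would not. The helper names (read64, store_split) and the
values are made up for illustration.

```python
def read64(mem):
    """Read the 8-byte little-endian value at offset 0."""
    return int.from_bytes(mem[0:8], "little")


def store_split(mem, value, observe):
    """Store a 64-bit value as two 32-bit halves.

    `observe` stands in for another thread or device that reads the
    location between the two half-stores.
    """
    raw = value.to_bytes(8, "little")
    mem[0:4] = raw[0:4]   # first half-store (low 32 bits)
    seen = observe(mem)   # observer runs between the two halves
    mem[4:8] = raw[4:8]   # second half-store (high 32 bits)
    return seen


mem = bytearray((0x1111111122222222).to_bytes(8, "little"))
seen = store_split(mem, 0xAAAAAAAABBBBBBBB, read64)

# The observer sees a mix of old and new halves: a torn value that
# neither the old nor the new 64-bit value accounts for.
print(hex(seen))         # 0x11111111bbbbbbbb
print(hex(read64(mem)))  # 0xaaaaaaaabbbbbbbb
```

With a single 64-bit store there is no point at which the observer could run
between the halves, which is why the split is observable for volatile or
atomic operations.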

>
>
>>
>> With the above in mind, I don't see it as unreasonable for frontends to
>> generate IR that LLVM is comfortable with.  We seem fine telling frontend
>> authors that they should strive to avoid large aggregate memory operations
>> in our performance tips guide <
>> http://llvm.org/docs/Frontend/PerformanceTips.html#avoid-loads-and-stores-of-large-aggregate-type>.
>> Implementation experience with Clang hasn't shown this to be particularly
>> odious to follow and none of the LLVM-side solutions seem satisfactory.
>>
>>
> Most front ends do not have clang's resources. Additionally, this tip is not
> quite accurate. I'm not interested in large aggregate load/store at this
> stage. I'm interested in ANY aggregate load/store. LLVM is just unable to
> handle any of it in a way that makes sense. It could certainly do better for
> small aggregates, without too much trouble.
>
>
I'm confused about what you mean by "clang resources" here; you haven't made
it clear what the burden is for your frontend. I'm not saying that there
isn't such a burden, I just haven't seen it articulated and I have
heard nothing similar from other folks using LLVM.  What prevents you from
performing field-at-a-time loads and stores or calls to the memcpy
intrinsic?
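
As a rough sketch of what the two alternatives could look like from a
frontend's side, the following hypothetical helpers emit textual IR for a
struct copy, either one field at a time or through the memcpy intrinsic. The
function names, value names (%src, %dst, %s0, ...) and the struct type are
assumptions for illustration only. The IR uses the typed-pointer syntax
current at the time of writing, with memcpy taking an explicit alignment
argument.

```python
def emit_field_copy(struct_ty, field_tys, src="%src", dst="%dst"):
    """Return IR lines that copy a struct one field at a time."""
    lines = []
    for i, ty in enumerate(field_tys):
        sp, dp, v = f"%s{i}", f"%d{i}", f"%v{i}"
        # Address of field i in the source and in the destination.
        lines.append(f"{sp} = getelementptr inbounds {struct_ty}, "
                     f"{struct_ty}* {src}, i32 0, i32 {i}")
        lines.append(f"{dp} = getelementptr inbounds {struct_ty}, "
                     f"{struct_ty}* {dst}, i32 0, i32 {i}")
        # Scalar load and store instead of one aggregate load/store.
        lines.append(f"{v} = load {ty}, {ty}* {sp}")
        lines.append(f"store {ty} {v}, {ty}* {dp}")
    return lines


def emit_memcpy(struct_ty, size, align, src="%src", dst="%dst"):
    """Return IR lines that copy `size` bytes with the memcpy intrinsic."""
    return [
        f"%dst8 = bitcast {struct_ty}* {dst} to i8*",
        f"%src8 = bitcast {struct_ty}* {src} to i8*",
        "call void @llvm.memcpy.p0i8.p0i8.i64("
        f"i8* %dst8, i8* %src8, i64 {size}, i32 {align}, i1 false)",
    ]


# Example: a made-up struct type %pair = type { i32, float }.
for line in emit_field_copy("%pair", ["i32", "float"]):
    print(line)
for line in emit_memcpy("%pair", 8, 4):
    print(line)
```

Nested aggregates would need a recursive walk over the fields, which this
sketch does not attempt.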