[PATCH] Add a pass to transform aggregate loads and stores into scalar loads and stores.
chandlerc at google.com
Wed Mar 4 22:51:05 PST 2015
On Wed, Mar 4, 2015 at 10:35 PM, Philip Reames <listmail at philipreames.com>
> On 03/03/2015 08:04 PM, Chandler Carruth wrote:
> On Tue, Mar 3, 2015 at 9:24 AM, Philip Reames <listmail at philipreames.com>
>> In http://reviews.llvm.org/D7780#133316, @chandlerc wrote:
>> > My primary question is why this isn't just an instcombine, using
>> insertvalue / extractvalue to reconnect the aggregates?
>> @chandlerc - Because we talked about that last time and it was rejected
>> for some reason I don't really remember. I would strongly prefer we land
>> this in something close to it's current form and then iterate in tree.
>> This has topic has been stalled on census for months. Honestly, I sorta of
>> agree with you, but let's revisit that separately.
> Absolutely not. I do *not* think we should add a pass if there are
> multiple people who don't think it belongs as a pass.
> If we keep losing the context of why this needs to be a pass, then we
> need to fix that problem rather than just "iterate in tree" to hide the
> problem. It won't get fixed otherwise, we'll just end up with bad code in
> the tree.
> Ok. I think you're misinterpreting my comments, but let's debate that
> over a beer tomorrow, not here. It's too easy to talk past each other in
> email. I feel like you're letting perfection be the enemy of the good
> here, but again, over a beer tomorrow.
> My "ideal design" for the current patch *only* would be to move the
> handling to instcombine. Specifically, we should unwrap a load or a store
> of an FCA when that FCA contains exactly one type and the size of that type
> covers the entire size of the FCA. (To be utterly clear, this code would
> involve no recursion!)
> After that has landed, I would likely to see instcombine extended to
> handle FCAs with multiple fields. I'm a bit unsure about this given that
> it appears to loose dereferenceability information, but given previously
> expressed desire to not change GVN to support FCAs, this seems like a
> workable approach.
I mean, happy to debate, but I'm not even sure we're on a different page
here. The stages you describe seem completely good to me. Notably, making
the changes incremental and small without backtracking in terms of design
or adding and then removing unneeded infrastructure. But anyways...
> (As a reminder, I am not personally using FCAs in any form. My only
> motivation for getting involved here is that we have had *multiple* bug
> reports on this topic with patches from interested parties and we have made
> *no* progress on resolving the underling issue. I see that as a problem.)
> Ok, so what I recall about this is that the problem was that these were
> not single-instruction replacements because there was the desire to not
> introduce insertvalue / extractvalue? Am I on the right path or not?
> Here's the quote from you in an earlier email thread titled "Add basic
> support for removal of load that are fed by a store of an aggregate":
Ah, of course. I had forgotten that this all started with GVN, which is why
I never even considered this email thread when searching...
> "The store that remains has nothing to do with an alloca. It is just a
> store off to wild memory as an FCA. SROA shouldn't be touching it, and I
> don't think we want to try to teach the entire optimizer about FCAs.
> You're teaching GVN about FCAs in this patch, but we also reason about
> store-to-load forwarding in *many* other places. I don't think it is really
> feasible to teach every part of LLVM this.
> The alternative is to define the canonical form as extracting the values
> from the FCA and storing them individually. SROA does this for loads and
> stores into allocas as a matter of correctness, but we could also teach
> instcombine to do this for all FCA loads and stores as a matter of
> canonical form and optimization. I think that is probably the right
> direction long term, but I'm a little scared of the down-stream
> The patch is trivial, and the code is already in SROA. Let me factor it
> out and I can post a patch to see if it also solves your problems. But I
> want to spend some time looking at what knock-on effects this has and
> whether they are reasonable."
> -- end quote
> If I'm reading you right here, you were essentially asking for an
> instcombine which replaced the FCA store with two i8 stores of the
> individual components.
Yes, that was my thought.
> In yet another thread, which I can't seem to find now, you said
> something about not wanting this (or maybe it was the related memcpy
> thing? Not sure) in instcombine.
I keep looking for this too as some of the context seems to have slipped my
mind.... But I'm slowly teasing it back together.
> I believe that's what led directly to the current patch. If you're okay
> with getting this in InstCombine, let's do so.
Indeed. So there are issues with splitting up an FCA store into multiple
smaller stores. They are the same issues as with all other store splitting
-- it loses the fact that a store was valid in the memory model (IE,
doesn't race). But this seems a very minor concern (to me) for FCAs. We
should make sure it doesn't lose other information that is significant
(your comment about dereferencability is what I'm thinking about) but
otherwise I don't see any specific problems here.
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the llvm-commits