[PATCH] Add a pass to transform aggregate loads and stores into scalar loads and stores.
listmail at philipreames.com
Wed Mar 4 22:58:33 PST 2015
On 03/04/2015 10:51 PM, Chandler Carruth wrote:
> On Wed, Mar 4, 2015 at 10:35 PM, Philip Reames
> <listmail at philipreames.com <mailto:listmail at philipreames.com>> wrote:
> On 03/03/2015 08:04 PM, Chandler Carruth wrote:
>> On Tue, Mar 3, 2015 at 9:24 AM, Philip Reames
>> <listmail at philipreames.com <mailto:listmail at philipreames.com>> wrote:
>> In http://reviews.llvm.org/D7780#133316, @chandlerc wrote:
>> > My primary question is why this isn't just an instcombine,
>> using insertvalue / extractvalue to reconnect the aggregates?
>> @chandlerc - Because we talked about that last time and it
>> was rejected for some reason I don't really remember. I
>> would strongly prefer we land this in something close to it's
>> current form and then iterate in tree. This has topic has
>> been stalled on census for months. Honestly, I sorta of
>> agree with you, but let's revisit that separately.
>> Absolutely not. I do *not* think we should add a pass if there
>> are multiple people who don't think it belongs as a pass.
>> If we keep losing the context of why this needs to be a pass,
>> then we need to fix that problem rather than just "iterate in
>> tree" to hide the problem. It won't get fixed otherwise, we'll
>> just end up with bad code in the tree.
> Ok. I think you're misinterpreting my comments, but let's debate
> that over a beer tomorrow, not here. It's too easy to talk past
> each other in email. I feel like you're letting perfection be the
> enemy of the good here, but again, over a beer tomorrow.
> My "ideal design" for the current patch *only* would be to move
> the handling to instcombine. Specifically, we should unwrap a
> load or a store of an FCA when that FCA contains exactly one type
> and the size of that type covers the entire size of the FCA. (To
> be utterly clear, this code would involve no recursion!)
> After that has landed, I would likely to see instcombine extended
> to handle FCAs with multiple fields. I'm a bit unsure about this
> given that it appears to loose dereferenceability information, but
> given previously expressed desire to not change GVN to support
> FCAs, this seems like a workable approach.
> I mean, happy to debate, but I'm not even sure we're on a different
> page here. The stages you describe seem completely good to me.
> Notably, making the changes incremental and small without backtracking
> in terms of design or adding and then removing unneeded
> infrastructure. But anyways...
I don't think we are in disagreement here. I didn't mean to imply we
were beyond the initial "iterate in tree" vs "get it right the first
> (As a reminder, I am not personally using FCAs in any form. My
> only motivation for getting involved here is that we have had
> *multiple* bug reports on this topic with patches from interested
> parties and we have made *no* progress on resolving the underling
> issue. I see that as a problem.)
>> Ok, so what I recall about this is that the problem was that
>> these were not single-instruction replacements because there was
>> the desire to not introduce insertvalue / extractvalue? Am I on
>> the right path or not?
> Here's the quote from you in an earlier email thread titled "Add
> basic support for removal of load that are fed by a store of an
> Ah, of course. I had forgotten that this all started with GVN, which
> is why I never even considered this email thread when searching...
> "The store that remains has nothing to do with an alloca. It is
> just a store off to wild memory as an FCA. SROA shouldn't be
> touching it, and I don't think we want to try to teach the entire
> optimizer about FCAs.
> You're teaching GVN about FCAs in this patch, but we also reason
> about store-to-load forwarding in *many* other places. I don't
> think it is really feasible to teach every part of LLVM this.
> The alternative is to define the canonical form as extracting the
> values from the FCA and storing them individually. SROA does this
> for loads and stores into allocas as a matter of correctness, but
> we could also teach instcombine to do this for all FCA loads and
> stores as a matter of canonical form and optimization. I think
> that is probably the right direction long term, but I'm a little
> scared of the down-stream rammifications.
> The patch is trivial, and the code is already in SROA. Let me
> factor it out and I can post a patch to see if it also solves your
> problems. But I want to spend some time looking at what knock-on
> effects this has and whether they are reasonable."
> -- end quote
> If I'm reading you right here, you were essentially asking for an
> instcombine which replaced the FCA store with two i8 stores of the
> individual components.
> Yes, that was my thought.
> In yet another thread, which I can't seem to find now, you said
> something about not wanting this (or maybe it was the related
> memcpy thing? Not sure) in instcombine.
> I keep looking for this too as some of the context seems to have
> slipped my mind.... But I'm slowly teasing it back together.
> I believe that's what led directly to the current patch. If
> you're okay with getting this in InstCombine, let's do so.
> Indeed. So there are issues with splitting up an FCA store into
> multiple smaller stores. They are the same issues as with all other
> store splitting -- it loses the fact that a store was valid in the
> memory model (IE, doesn't race). But this seems a very minor concern
> (to me) for FCAs. We should make sure it doesn't lose other
> information that is significant (your comment about dereferencability
> is what I'm thinking about) but otherwise I don't see any specific
> problems here.
Ok, I think we're in agreement here. Let's move the "interesting" bits
of the proposed patch over to being a patch for instcombine. I agree
that the load legality and dereferenceability issues are secondary. We
probably want to document that with a comment, but that's not even the
first patch, so let's defer it.
deadalnix - Do you want to do this step? I could understand your being
a bit frustrated at this point. If not, I'm happy to handle getting
this ported over and reviewed. Just let me know. I really want to see
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the llvm-commits