Add a pass to convert aggregate loads/stores into scalar load stores

Hal Finkel hfinkel at anl.gov
Thu Oct 30 14:13:53 PDT 2014


----- Original Message -----
> From: "Philip Reames" <listmail at philipreames.com>
> To: "Chandler Carruth" <chandlerc at google.com>, "Hal Finkel" <hfinkel at anl.gov>
> Cc: "llvm-commits" <llvm-commits at cs.uiuc.edu>
> Sent: Thursday, October 30, 2014 4:08:09 PM
> Subject: Re: Add a pass to convert aggregate loads/stores into scalar load stores
> 
> 
> 
> On 10/28/2014 09:27 AM, Chandler Carruth wrote:
> 
> On Tue, Oct 28, 2014 at 6:55 AM, Hal Finkel < hfinkel at anl.gov >
> wrote:
> 
> 
> 
> Chandler,
> 
> Can you please look at this? How do you think we should canonicalize
> this (is this the right approach)?
> Oof... yea.
> 
> 
> I wish I were more confident of what the "right" answer is here any
> more. =[ At the dev conference, Hal and I talked about a couple of
> options. I'm sure which is the "right" one, so let me lay out what I
> remember of that discussion.
> 
> Option 1 - Leave the load alone, improve GVN
> - not necessarily a bad option, but slightly risky in that it
> involves changes to key infrastructure with little in tree
> motivation
> - the original change doing this was rejected
> 
> Option 2 - Transform to load of component element types
> - tricky to get layout exactly right, but definitely doable
> 
> Option 3 - Transform to load of iN where N is sizeof(agg)*8.
> 
> Option 4 - Transform to series of smaller integer loads
> - This appears to be what this patch implements. Not entirely sure
> why this was chosen.
> 
> Option 5 - Transform to alloca and memcpy
> - Clang appears to emit loads of structs via an alloca (for the
> local) and a memcpy. The optimizer deconstructs this where
> appropriate.
> - I have no idea why Clang chose this option. My best guess is to
> exploit information about POD types?
> 
> Personally, I'd lean towards 5

Thanks for looking into this. I agree, 5 seems like the best option. That is what Clang does and thus what the existing infrastructure will be best tuned (and most motivated) to handle.

>,1,2 above (in roughly that order). 1 &
> 2 seem like better long term solutions, but 5 probably works fairly
> well today. I'm not really a fan of either 3 or 4 since we loose
> things like distinctions between pointers and integers. If we had to
> choose, I'd take 3 over 4.
> 
> I think we also discussed the trade off between a pass and an
> instcombine transform. I would lean towards making whichever option
> we chose a canonicalization rule in instcombine.
> 
> Also, this patch implements option 2 for a struct with a single
> element type which seems like a (independently) useful
> canonicalization. Should we introduce that transform as a
> canonicalization in instcombine? I'd lean towards that.

I agree this this too, both seem like useful canonicalizations, and suitable for instcombine.

 -Hal

> 
> Chandler, Hal - Thoughts, opinions?
> 
> 
> 
> 
> 
> 
> _______________________________________________
> llvm-commits mailing list llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory



More information about the llvm-commits mailing list