Add a pass to convert aggregate loads/stores into scalar load stores

Philip Reames listmail at philipreames.com
Thu Oct 30 14:19:09 PDT 2014


On 10/30/2014 02:13 PM, Hal Finkel wrote:
> ----- Original Message -----
>> From: "Philip Reames" <listmail at philipreames.com>
>> To: "Chandler Carruth" <chandlerc at google.com>, "Hal Finkel" <hfinkel at anl.gov>
>> Cc: "llvm-commits" <llvm-commits at cs.uiuc.edu>
>> Sent: Thursday, October 30, 2014 4:08:09 PM
>> Subject: Re: Add a pass to convert aggregate loads/stores into scalar load stores
>>
>>
>>
>> On 10/28/2014 09:27 AM, Chandler Carruth wrote:
>>
>> On Tue, Oct 28, 2014 at 6:55 AM, Hal Finkel < hfinkel at anl.gov >
>> wrote:
>>
>>
>>
>> Chandler,
>>
>> Can you please look at this? How do you think we should canonicalize
>> this (is this the right approach)?
>> Oof... yea.
>>
>>
>> I wish I were more confident of what the "right" answer is here any
>> more. =[ At the dev conference, Hal and I talked about a couple of
>> options. I'm sure which is the "right" one, so let me lay out what I
>> remember of that discussion.
>>
>> Option 1 - Leave the load alone, improve GVN
>> - not necessarily a bad option, but slightly risky in that it
>> involves changes to key infrastructure with little in tree
>> motivation
>> - the original change doing this was rejected
>>
>> Option 2 - Transform to load of component element types
>> - tricky to get layout exactly right, but definitely doable
>>
>> Option 3 - Transform to load of iN where N is sizeof(agg)*8.
>>
>> Option 4 - Transform to series of smaller integer loads
>> - This appears to be what this patch implements. Not entirely sure
>> why this was chosen.
>>
>> Option 5 - Transform to alloca and memcpy
>> - Clang appears to emit loads of structs via an alloca (for the
>> local) and a memcpy. The optimizer deconstructs this where
>> appropriate.
>> - I have no idea why Clang chose this option. My best guess is to
>> exploit information about POD types?
>>
>> Personally, I'd lean towards 5
> Thanks for looking into this. I agree, 5 seems like the best option. That is what Clang does and thus what the existing infrastructure will be best tuned (and most motivated) to handle.
>
>> ,1,2 above (in roughly that order). 1 &
>> 2 seem like better long term solutions, but 5 probably works fairly
>> well today. I'm not really a fan of either 3 or 4 since we loose
>> things like distinctions between pointers and integers. If we had to
>> choose, I'd take 3 over 4.
>>
>> I think we also discussed the trade off between a pass and an
>> instcombine transform. I would lean towards making whichever option
>> we chose a canonicalization rule in instcombine.
>>
>> Also, this patch implements option 2 for a struct with a single
>> element type which seems like a (independently) useful
>> canonicalization. Should we introduce that transform as a
>> canonicalization in instcombine? I'd lean towards that.
> I agree this this too, both seem like useful canonicalizations, and suitable for instcombine.
>
>   -Hal
If Chandler agrees, I'm happy to help implement and/or review both of 
these changes in InstCombine.  If structured properly, these should be a 
few lines each.
>> Chandler, Hal - Thoughts, opinions?
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> llvm-commits mailing list llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>




More information about the llvm-commits mailing list