Add a pass to convert aggregate loads/stores into scalar load stores

Philip Reames listmail at philipreames.com
Fri Oct 31 12:03:59 PDT 2014


On 10/30/2014 10:11 PM, Hal Finkel wrote:
> ----- Original Message -----
>> From: "Reid Kleckner" <rnk at google.com>
>> To: "Hal J. Finkel" <hfinkel at anl.gov>
>> Cc: "Philip Reames" <listmail at philipreames.com>, "Chandler Carruth" <chandlerc at google.com>, "llvm-commits"
>> <llvm-commits at cs.uiuc.edu>
>> Sent: Thursday, October 30, 2014 7:34:17 PM
>> Subject: Re: Add a pass to convert aggregate loads/stores into scalar load stores
>>
>>
>> I don't like the memcpy approach because we introduce a new memory
>> temporary that wasn't present in the input LLVM IR, and then rely on
>> SROA to zap it for us later. I think we should skip that step and
>> transform aggregate-load+extractvalue directly into scalar loads.
> SROA may or may not zap it later, but SROA will also tend to transform things into wide loads, not individual ones.
>
> The underlying issue, as Chandler has pointed out several times in various contexts, is that once you break a load up into smaller loads, you often lose the knowledge that the entire range of bytes is accessible. This knowledge is important when we get to CodeGen because it allows us to use wider loads which are often more efficient. The canonical form should not lose this information, and so we don't want to split up the load as the canonical choice.
>
> Introducing an extra temporary does not particularly bother me, but the key thing is to stay as close as possible to the idiom that Clang is using because that is the idiom that I know we need to focus on optimizing well. Aligning our canonical form with the IR that Clang produces seems like the right thing to do in this case... it will maximally allow users of FCAs to benefit from the optimization philosophy employed by the rest of the system.
>
> Thanks again,
> Hal
I think at this point we're down to two viable choices: 1 (extend GVN
for FCA) and 5 (the alloca/memcpy idiom).  Given how simple 5 is, I'm
tempted to run with that for the moment, but 1 still seems like the
better long-term investment.

Philip
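
For readers following along, the two shapes compared in this thread can be sketched in C. This is an illustration only, not code from the patch or the thread: `struct Pair`, `read_b_via_copy`, and `read_b_scalar` are hypothetical names, and how a particular front end lowers each function to IR is an assumption. The first function copies the whole aggregate into a temporary with memcpy, which keeps it visible that all `sizeof(struct Pair)` bytes are accessible. The second reads a single field, so only that field's bytes are known to be accessible.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical two-field aggregate standing in for a first-class aggregate. */
struct Pair {
    int a;
    int b;
};

/* Temporary + memcpy shape (the "alloca/memcpy idiom"): the copy covers the
 * whole object, so the full byte range of *p is read in one operation. */
int read_b_via_copy(const struct Pair *p) {
    assert(p != NULL);
    struct Pair tmp;
    memcpy(&tmp, p, sizeof tmp);
    return tmp.b;
}

/* Scalar shape: only the bytes of field b are read, so nothing here says
 * that field a is also accessible. */
int read_b_scalar(const struct Pair *p) {
    assert(p != NULL);
    return p->b;
}
```

Both functions return the same value for any valid `struct Pair`. The difference discussed above is only in what each form tells later passes about the accessible range.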



