[LLVMdev] RFC: Missing canonicalization in LLVM
Daniel Stewart
stewartd at codeaurora.org
Tue Apr 21 09:18:53 PDT 2015
So this change did indeed have an effect! :)
I’m seeing regressions in a number of benchmarks mainly due to a host of extra bitcasts that get introduced. Here’s the problem I’m seeing in a nutshell:
1) There is a Phi with input type double
2) Polly demotes the phi into a load/store of type double
3) InstCombine canonicalizes the load/store to use i64 instead of double
4) SROA removes the load/store & inserts a phi back in, using i64 as the type. Inserts bitcast to get to double.
5) The bitcast sticks around and eventually get translated into FMOVs (for AArch64 at least).
The function findCommonType() in SROA.cpp is used to obtain the type that should be used for the new alloca that SROA wants to create. It’s decision process is essentially – if all loads/stores of alloca are the same, use that type; else use the corresponding integer type. This causes bitcasts to be inserted in a number of places, most all of which stick around.
I’ve copied a reduced version of an instance of the problem below. I’m looking for comments on what others think is the right solution here. Make SROA more intelligent about picking the type?
The code is below with all unnecessary code removed for easy consumption.
Daniel
Before Polly – Prepare code for polly we have code that looks like:
while.cond473: ; preds = %while.cond473.outer78, %while.body475
%p_j_x452.0 = phi double [ %105, %while.body475 ], [ %p_j_x452.0.ph82, %while.cond473.outer78 ]
while.body475: ; preds = %while.cond473
%sub480 = fsub fast double %64, %p_j_x452.0
%105 = load double* %x485, align 8, !tbaa !25
After Polly – Prepare code for polly we have:
while.cond473: ; preds = %while.cond473.outer78, %while.body475
%p_j_x452.0.reload = load double* %p_j_x452.0.reg2mem
while.body475: ; preds = %while.cond473
%sub480 = fsub fast double %64, %p_j_x452.0.reload
%110 = load double* %x485, align 8, !tbaa !25
store double %110, double* %p_j_x452.0.reg2mem
After Combine redundant instructions :
while.cond473: ; preds = %while.cond473.outer78, %while.body475
%p_j_x452.0.reload = load double* %p_j_x452.0.reg2mem, align 8
while.body475: ; preds = %while.cond473
%sub480 = fsub fast double %74, %p_j_x452.0.reload
%x485 = getelementptr inbounds %struct.CompAtom* %15, i64 %idxprom482, i32 0, i32 0
%194 = bitcast double* %x485 to i64*
%195 = load i64* %194, align 8, !tbaa !25
%200 = bitcast double* %p_j_x452.0.reg2mem to i64*
store i64 %195, i64* %200, align 8
After SROA :
while.cond473: ; preds = %while.cond473.outer78, %while.body475
%p_j_x452.0.reg2mem.sroa.0.0.p_j_x452.0.reload362 = phi i64 [ %p_j_x452.0.ph73.reg2mem.sroa.0.0.load368, %while.cond473.outer78 ], [ %178, %while.body475 ]
%173 = bitcast i64 %p_j_x452.0.reg2mem.sroa.0.0.p_j_x452.0.reload362 to double
while.body475: ; preds = %while.cond473
%sub480 = fsub fast double %78, %173
%x485 = getelementptr inbounds %struct.CompAtom* %15, i64 %idxprom482, i32 0, i32 0
%177 = bitcast double* %x485 to i64*
%178 = load i64* %177, align 8, !tbaa !25
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Chandler Carruth
Sent: Wednesday, January 21, 2015 8:32 PM
To: Pete Cooper
Cc: LLVM Developers Mailing List
Subject: Re: [LLVMdev] RFC: Missing canonicalization in LLVM
On Wed, Jan 21, 2015 at 3:06 PM, Pete Cooper <peter_cooper at apple.com <mailto:peter_cooper at apple.com> > wrote:
Sounds good to me. Integers it is then.
FYI, thanks, I'm just going to commit this then. It seems we're all in essential agreement. We can revert it and take a more cautious approach if something terrible happens. =]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150421/efc76068/attachment.html>
More information about the llvm-dev
mailing list