[LLVMbugs] [Bug 10872] New: [loop-idiom] GVN fails to remove loads after loop-idiom recognition
bugzilla-daemon at llvm.org
Tue Sep 6 10:30:51 PDT 2011
http://llvm.org/bugs/show_bug.cgi?id=10872
Summary: [loop-idiom] GVN fails to remove loads after loop-idiom recognition
Product: libraries
Version: trunk
Platform: PC
OS/Version: All
Status: ASSIGNED
Severity: normal
Priority: P
Component: Scalar Optimizations
AssignedTo: resistor at mac.com
ReportedBy: atrick at apple.com
CC: llvmbugs at cs.uiuc.edu
Test case: SingleSource/Benchmarks/Stanford test.simple.Puzzle on A9 is 14%
slower with -unroll-scev.
Compiling with -O3 produces O3.ll (0.54s).
Adding -unroll-scev produces O3-unroll-scev.ll (0.64s).
These runs used r138990. -unroll-scev will soon be the default, but
-disable-unroll-scev will be available.
llc -mcpu=cortex-a9 -relocation-model=pic -disable-fp-elim -disable-non-leaf-fp-elim O3.ll
-unroll-scev exposes more opportunities for memset_pattern, resulting in:
call void @memset_pattern16(
    i8* bitcast (i32* getelementptr inbounds ([13 x [512 x i32]]* @p, i32 0, i32 6, i32 0) to i8*),
    i8* bitcast ([4 x i32]* @.memset_pattern3 to i8*),
    i32 12) nounwind
This is fine, but then GVN fails to remove the subsequent loads:
  %tmp2.i = load i32* getelementptr inbounds ([13 x i32]* @piecemax, i32 0, i32 0), align 4, !tbaa !3
for.end.i:    ; preds = %for.inc.i8, %if.then
  %tmp14.i = load i32* getelementptr inbounds ([13 x i32]* @class, i32 0, i32 0), align 4, !tbaa !3
Removing %tmp2.i exposes lots of constant folding, but it is the removal of
%tmp14.i that speeds up the benchmark.
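For illustration, here is a minimal C sketch of the kind of source pattern that could give rise to IR of this shape. It is not the Puzzle benchmark source: the function name define_piece, the global name piece_class (standing in for @class), and the stored values 7 and 2 are all hypothetical. Only the array shapes and the 12-byte fill at p[6][0] are taken from the IR above. In this sketch the values are stored before the fill loop, so the reloads after it are redundant in principle, since the fill writes only to p.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the benchmark's globals; the shapes
 * match the types in the IR above ([13 x [512 x i32]], [13 x i32]). */
int p[13][512];
int piecemax[13];
int piece_class[13]; /* stands in for @class */

/* Hypothetical function: stores two values, fills three consecutive
 * ints with the same nonzero value, then reloads the two values. */
int define_piece(void)
{
    int i;

    piecemax[0] = 7;    /* arbitrary illustrative value */
    piece_class[0] = 2; /* arbitrary illustrative value */

    /* Three consecutive 4-byte stores of the same nonzero value
     * (12 bytes): a candidate for conversion into a
     * memset_pattern16 call by loop-idiom recognition. */
    for (i = 0; i < 3; i++)
        p[6][i] = 1;

    /* These reloads follow the fill. The fill only writes to p, so
     * forwarding the stored values here would be valid; an opaque
     * call between the stores and the loads can hide that. */
    return piecemax[0] + piece_class[0];
}
```

Whether a given compiler actually emits memset_pattern16 for this loop depends on the target, revision, and flags; the sketch only shows the store, fill, reload shape.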
Also rdar://10065079