[LLVMbugs] [Bug 14657] New: Poor AVX code generation on 8xi1 masks
bugzilla-daemon at llvm.org
Wed Dec 19 14:15:19 PST 2012
http://llvm.org/bugs/show_bug.cgi?id=14657
Bug #: 14657
Summary: Poor AVX code generation on 8xi1 masks
Product: libraries
Version: trunk
Platform: PC
OS/Version: All
Status: NEW
Severity: enhancement
Priority: P
Component: Backend: X86
AssignedTo: unassignedbugs at nondot.org
ReportedBy: nrotem at apple.com
CC: llvmbugs at cs.uiuc.edu
Classification: Unclassified
The loop below comes from the 'gcc-loops' benchmark and is vectorized by the
Loop Vectorizer. Because of this problem we are 50% slower than GCC on this
loop.
When compiled with llc, the instruction "%14 = and <8 x i1> ..." becomes an AND
on a 128-bit XMM register, even though the compared vectors are 256 bits wide.
This is due to the way we type-legalize vectors. I think that the best way to
solve this problem is to implement an x86-specific dag-combine pattern to
handle these cases.
define void @_Z9example25v() nounwind uwtable noinline ssp {
vector.ph:
br label %vector.body
vector.body: ; preds = %vector.body, %vector.ph
%index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
%0 = getelementptr inbounds [1024 x float]* @da, i64 0, i64 %index
%1 = bitcast float* %0 to <8 x float>*
%2 = load <8 x float>* %1, align 16
%3 = getelementptr inbounds [1024 x float]* @db, i64 0, i64 %index
%4 = bitcast float* %3 to <8 x float>*
%5 = load <8 x float>* %4, align 16
%6 = fcmp olt <8 x float> %2, %5
%7 = getelementptr inbounds [1024 x float]* @dc, i64 0, i64 %index
%8 = bitcast float* %7 to <8 x float>*
%9 = load <8 x float>* %8, align 16
%10 = getelementptr inbounds [1024 x float]* @dd, i64 0, i64 %index
%11 = bitcast float* %10 to <8 x float>*
%12 = load <8 x float>* %11, align 16
%13 = fcmp olt <8 x float> %9, %12
%14 = and <8 x i1> %6, %13
%15 = zext <8 x i1> %14 to <8 x i32>
%16 = getelementptr inbounds [1024 x i32]* @dj, i64 0, i64 %index
%17 = bitcast i32* %16 to <8 x i32>*
store <8 x i32> %15, <8 x i32>* %17, align 16
%index.next = add i64 %index, 8
%18 = icmp eq i64 %index.next, 1024
br i1 %18, label %for.end, label %vector.body
for.end: ; preds = %vector.body
ret void
}
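For readers who prefer source over IR, the vectorized body above appears to correspond to a scalar loop along the following lines. This is a minimal sketch reconstructed from the IR, not copied from the benchmark: the array names and the trip count of 1024 come from the IR, while the constant `N`, the local names `x` and `y`, and the element types of the temporaries are assumptions made for illustration.

```cpp
// Hypothetical scalar reconstruction of the loop, inferred from the IR above.
// N, x and y are illustrative names; they do not appear in the bug report.
const int N = 1024;                 // from "icmp eq i64 %index.next, 1024"

float da[N], db[N], dc[N], dd[N];   // the four float inputs loaded in the IR
int dj[N];                          // the i32 output stored in the IR

void example25() {
    for (int i = 0; i < N; ++i) {
        bool x = da[i] < db[i];     // corresponds to the first "fcmp olt"
        bool y = dc[i] < dd[i];     // corresponds to the second "fcmp olt"
        dj[i] = x & y;              // "and" of the two i1 masks, then "zext" to i32
    }
}
```

Each iteration produces 0 or 1, so after vectorization by 8 the two comparison results are naturally <8 x i1> masks, which is where the AND discussed above comes from.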