[llvm-bugs] [Bug 33674] New: [AVX-512] select + add can produce better sequence
via llvm-bugs
llvm-bugs at lists.llvm.org
Sun Jul 2 06:29:01 PDT 2017
https://bugs.llvm.org/show_bug.cgi?id=33674
Bug ID: 33674
Summary: [AVX-512] select + add can produce better sequence
Product: libraries
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: enhancement
Priority: P
Component: Backend: X86
Assignee: unassignedbugs at nondot.org
Reporter: elena.demikhovsky at intel.com
CC: llvm-bugs at lists.llvm.org
This is the simplified C code:
  if (B[i] > 1)
    Sum += A[i];
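For context, the fragment above presumably sits inside a reduction loop. A minimal sketch of such a loop follows; the function name `conditional_sum`, the parameter `n`, and the use of plain `int` elements (chosen to match the `<16 x i32>` vectors in the IR) are assumptions, not taken from the report.

```c
#include <stddef.h>

/* Hypothetical wrapper around the loop body from the report.
   The name, the signature, and the loop bound n are assumptions. */
int conditional_sum(const int *A, const int *B, size_t n)
{
    int Sum = 0;
    for (size_t i = 0; i < n; ++i) {
        if (B[i] > 1)      /* the condition that becomes the vector mask */
            Sum += A[i];   /* the conditional accumulate */
    }
    return Sum;
}
```

When a loop of this shape is vectorized for AVX-512, the comparison on `B` yields a 16-lane mask and the conditional accumulate becomes a masked load of `A` followed by a select and an add.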
  %8 = load <16 x i32>, <16 x i32>* %7, align 4, !dbg !27, !tbaa !30
  %9 = icmp sgt <16 x i32> %8, <i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1, i32 1>
  %10 = getelementptr inbounds i32, i32* %0, i64 %4
  %11 = bitcast i32* %10 to <16 x i32>*
  %12 = call <16 x i32> @llvm.masked.load.v16i32.p0v16i32(<16 x i32>* %11, i32 4, <16 x i1> %9, <16 x i32> undef)
  %13 = select <16 x i1> %9, <16 x i32> %12, <16 x i32> zeroinitializer
  %14 = add nsw <16 x i32> %5, %13
This code generates the following sequence:
vmovdqu32 zmm2, zmmword ptr [rsi + rax]
vpcmpgtd k1, zmm2, zmm0
vmovdqu32 zmm2 {k1} {z}, zmmword ptr [rdi + rax]
vmovdqa32 zmm2 {k1} {z}, zmm2
vpaddd zmm1, zmm1, zmm2
A better sequence would be:
vpcmpd k1, zmm3, ZMMWORD PTR [rsi+rcx*4], 1
vpaddd zmm4{k1}, zmm4, ZMMWORD PTR [rdi+rcx*4]
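To illustrate why the two sequences should agree, here is a scalar, lane-by-lane model in C. It is only a sketch: the function names and the `LANES` constant are hypothetical, and the zeroing behaviour of the masked load models what the generated code does with `{z}` rather than anything the IR guarantees for an `undef` passthrough.

```c
#include <stdint.h>

#define LANES 16  /* one 512-bit register holds 16 x i32 */

/* Models the currently generated sequence: compare, zero-masked load,
   zero-masked move (the select), then an unmasked add. */
void select_then_add(int32_t acc[LANES], const int32_t a[LANES],
                     const int32_t b[LANES])
{
    for (int i = 0; i < LANES; ++i) {
        int k = b[i] > 1;                 /* mask bit from the compare */
        int32_t loaded = k ? a[i] : 0;    /* masked load with zeroing */
        int32_t selected = k ? loaded : 0; /* select against zero */
        acc[i] += selected;               /* add zero in inactive lanes */
    }
}

/* Models the proposed sequence: compare, then a merge-masked add that
   leaves inactive accumulator lanes untouched. */
void masked_add(int32_t acc[LANES], const int32_t a[LANES],
                const int32_t b[LANES])
{
    for (int i = 0; i < LANES; ++i) {
        if (b[i] > 1)
            acc[i] += a[i];
    }
}
```

Adding zero to an inactive lane and leaving that lane unchanged give the same accumulator value, which is what makes folding the select into the add's mask possible in this model.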