r274110 - [AVX512] Zero extend cmp intrinsic return value.

Mon Jul 4 02:55:45 PDT 2016

Thanks for working on this!

Regards,
Igor

From: Craig Topper [mailto:craig.topper at gmail.com]
Sent: Monday, July 04, 2016 10:20
To: Breger, Igor
Cc: Eric Christopher via cfe-commits; Demikhovsky, Elena; Ouriel, Boaz
Subject: Re: r274110 - [AVX512] Zero extend cmp intrinsic return value.

I've modified the indices we use for the zero vector in r274484. This gives better codegen as it now looks like a concat to the backend. So now we just end up emitting unnecessary kshiftlw/kshiftrw pairs instead of converting to a wider vector op and back.

~Craig
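
For readers following the index discussion, here is a minimal standalone sketch (no LLVM dependency) of the two shuffle-mask patterns. The first function mirrors the loop visible in the r274110 diff. The second is only an assumption about what "looks like a concat" could mean, since the thread does not spell out the exact indices used in r274484. The function names and the `shuffleToMask` helper are hypothetical and exist only for illustration.

```cpp
#include <array>
#include <cstdint>
#include <vector>

// Pattern from the r274110 diff: every lane past NumElts selects index
// NumElts, i.e. element 0 of the second (all-zero) shuffle operand.
std::array<uint32_t, 8> repeatedZeroIndices(unsigned NumElts) {
  std::array<uint32_t, 8> Indices{};
  for (unsigned i = 0; i != NumElts; ++i)
    Indices[i] = i;
  for (unsigned i = NumElts; i != 8; ++i)
    Indices[i] = NumElts;
  return Indices;
}

// ASSUMED concat-like pattern (not taken from the thread): lanes past
// NumElts step through the zero operand in order, wrapping when needed.
// For NumElts == 4 this yields 0..7, a plain concatenation of the operands.
std::array<uint32_t, 8> concatLikeIndices(unsigned NumElts) {
  std::array<uint32_t, 8> Indices{};
  for (unsigned i = 0; i != NumElts; ++i)
    Indices[i] = i;
  for (unsigned i = NumElts; i != 8; ++i)
    Indices[i] = i % NumElts + NumElts;
  return Indices;
}

// Hypothetical model of "shufflevector Cmp, zeroinitializer, Indices"
// followed by a bitcast of the <8 x i1> result to an 8-bit mask.
uint8_t shuffleToMask(const std::vector<bool> &Cmp,
                      const std::array<uint32_t, 8> &Indices) {
  const uint32_t NumElts = static_cast<uint32_t>(Cmp.size());
  uint8_t Mask = 0;
  for (unsigned i = 0; i != 8; ++i) {
    // Indices below NumElts read the compare result; all others read
    // the second operand, which is all zero.
    const bool Bit = Indices[i] < NumElts ? Cmp[Indices[i]] : false;
    Mask = static_cast<uint8_t>(Mask | (Bit ? (1u << i) : 0u));
  }
  return Mask;
}
```

Both patterns produce the same zero-extended mask value; under the assumption above they differ only in how recognizable the shuffle is to the backend.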

On Sun, Jul 3, 2016 at 1:32 PM, Craig Topper <craig.topper at gmail.com> wrote:
Also, should we change the AutoUpgrade code for the cmp intrinsics in the backend to use zero instead of undef as well?

~Craig

On Sun, Jul 3, 2016 at 12:11 AM, Breger, Igor <igor.breger at intel.com> wrote:
Hello Craig,
Thanks a lot for pointing it out to me. I am familiar with this problem; we are planning to improve CodeGen to handle this case in the near future.

Regards,
Igor

From: Craig Topper [mailto:craig.topper at gmail.com]
Sent: Saturday, July 02, 2016 08:46
To: Breger, Igor; Eric Christopher via cfe-commits
Subject: Re: r274110 - [AVX512] Zero extend cmp intrinsic return value.

This change now codegens to something really awful. Can you take a look?

            .section            __TEXT,__text,regular,pure_instructions
            .section            __TEXT,__literal8,8byte_literals
            .p2align           3
LCPI0_0:
            .quad   -1
            .section            __TEXT,__const
            .p2align           6
LCPI0_1:
            .quad   0
            .quad   1
            .quad   2
            .quad   3
            .quad   8
            .quad   8
            .quad   8
            .quad   8
            .section            __TEXT,__text,regular,pure_instructions
            .globl   _test_mm_cmpeq_epu32_mask
            .p2align           4, 0x90
_test_mm_cmpeq_epu32_mask:
            vpcmpeqd       %xmm1, %xmm0, %k1
            vpbroadcastq   LCPI0_0(%rip), %zmm0 {%k1} {z}
            vpxord %zmm1, %zmm1, %zmm1
            vmovdqa64     LCPI0_1(%rip), %zmm2
            vpermt2q         %zmm1, %zmm2, %zmm0
            vpsllq   $63, %zmm0, %zmm0
            vptestmq         %zmm0, %zmm0, %k0
            kmovw            %k0, %eax
            retq

~Craig

On Wed, Jun 29, 2016 at 1:14 AM, Igor Breger via cfe-commits <cfe-commits at lists.llvm.org> wrote:
Author: ibreger
Date: Wed Jun 29 03:14:17 2016
New Revision: 274110

URL: http://llvm.org/viewvc/llvm-project?rev=274110&view=rev
Log:
[AVX512]  Zero extend cmp intrinsic return value.

Differential Revision: http://reviews.llvm.org/D21746

Modified:
    cfe/trunk/lib/CodeGen/CGBuiltin.cpp
    cfe/trunk/test/CodeGen/avx512vl-builtins.c

Modified: cfe/trunk/lib/CodeGen/CGBuiltin.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGBuiltin.cpp?rev=274110&r1=274109&r2=274110&view=diff
==============================================================================

--- cfe/trunk/lib/CodeGen/CGBuiltin.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGBuiltin.cpp Wed Jun 29 03:14:17 2016
@@ -6460,8 +6460,8 @@ static Value *EmitX86MaskedCompare(CodeG
       Indices[i] = i;
     for (unsigned i = NumElts; i != 8; ++i)
       Indices[i] = NumElts;
-    Cmp = CGF.Builder.CreateShuffleVector(Cmp, UndefValue::get(Cmp->getType()),
-                                          Indices);
+    Cmp = CGF.Builder.CreateShuffleVector(
+        Cmp, llvm::Constant::getNullValue(Cmp->getType()), Indices);
   }
   return CGF.Builder.CreateBitCast(Cmp,
                                    IntegerType::get(CGF.getLLVMContext(),

Modified: cfe/trunk/test/CodeGen/avx512vl-builtins.c
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/avx512vl-builtins.c?rev=274110&r1=274109&r2=274110&view=diff
==============================================================================
--- cfe/trunk/test/CodeGen/avx512vl-builtins.c (original)
+++ cfe/trunk/test/CodeGen/avx512vl-builtins.c Wed Jun 29 03:14:17 2016
@@ -8,6 +8,7 @@
 __mmask8 test_mm_cmpeq_epu32_mask(__m128i __a, __m128i __b) {
   // CHECK-LABEL: @test_mm_cmpeq_epu32_mask
   // CHECK: icmp eq <4 x i32> %{{.*}}, %{{.*}}
+  // CHECK: shufflevector <4 x i1> %{{.*}}, <4 x i1> zeroinitializer, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 4, i32 4, i32 4>
   return (__mmask8)_mm_cmpeq_epu32_mask(__a, __b);
 }



_______________________________________________
cfe-commits mailing list
cfe-commits at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


---------------------------------------------------------------------
Intel Israel (74) Limited

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
