[PATCH] D37668: [X86][intrinsics] lower _mm[256|512]_mask[z]_set1_epi[8|16|32|64] intrinsic to IR

Craig Topper via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Sep 12 00:30:21 PDT 2017


craig.topper added inline comments.


================
Comment at: include/clang/Basic/BuiltinsX86.def:981
-TARGET_BUILTIN(__builtin_ia32_pbroadcastd512_gpr_mask, "V16iiV16iUs", "", "avx512f")
 TARGET_BUILTIN(__builtin_ia32_pbroadcastq512_mem_mask, "V8LLiLLiV8LLiUc", "", "avx512f")
 TARGET_BUILTIN(__builtin_ia32_loaddqusi512_mask, "V16iiC*V16iUs", "", "avx512f")
----------------
I think your patch removed the only use of __builtin_ia32_pbroadcastq512_mem_mask, right? Does your change work properly in 32-bit mode?


================
Comment at: lib/Headers/avx512bwintrin.h:2031
 {
-  return (__m512i) __builtin_ia32_pbroadcastb512_gpr_mask (__A,
-                 (__v64qi) __O,
-                 __M);
+  __m512i __V = _mm512_set1_epi8(__A);
+  return (__m512i) __builtin_ia32_selectb_512(__M,(__v64qi)__V,(__v64qi) __O);
----------------
We usually don't declare variables in the intrinsics if we can avoid it. Just nest the calls.
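
For illustration only: in the hunk above, nesting would presumably mean passing _mm512_set1_epi8(__A) directly as the second argument of __builtin_ia32_selectb_512 instead of going through __V. The sketch below models that shape in portable scalar C so it can be compiled without AVX-512 support. The names vec512_i8, model_set1_epi8, model_select_epi8 and model_mask_set1_epi8 are hypothetical stand-ins, not anything from the clang headers, and the per-lane select semantics are an assumption about what the builtin does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a 512-bit vector of 64 signed bytes. */
typedef struct { int8_t lane[64]; } vec512_i8;

/* Model of a broadcast: every lane receives the same scalar. */
static inline vec512_i8 model_set1_epi8(int8_t a) {
    vec512_i8 v;
    for (size_t i = 0; i < 64; ++i)
        v.lane[i] = a;
    return v;
}

/* Model of a per-lane select: mask bit i set picks t, clear picks f
   (assumed semantics, stated here only for the sketch). */
static inline vec512_i8 model_select_epi8(uint64_t m, vec512_i8 t, vec512_i8 f) {
    vec512_i8 r;
    for (size_t i = 0; i < 64; ++i)
        r.lane[i] = ((m >> i) & 1u) ? t.lane[i] : f.lane[i];
    return r;
}

/* Masked broadcast written with the calls nested and no local temporary. */
static inline vec512_i8 model_mask_set1_epi8(vec512_i8 o, uint64_t m, int8_t a) {
    return model_select_epi8(m, model_set1_epi8(a), o);
}
```

A likely motivation for the convention is that a local such as __V adds one more reserved-style identifier to a header that is included into user code, while the nested form needs no extra names.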


================
Comment at: test/CodeGen/avx512vl-builtins.c:4511
   // CHECK-LABEL: @test_mm256_mask_set1_epi32
-  // CHECK: @llvm.x86.avx512.mask.pbroadcast.d.gpr.256
+    // CHECK:  insertelement <8 x i32> undef, i32 %{{.*}}, i32 0
+  // CHECK:  insertelement <8 x i32> %{{.*}}, i32 %{{.*}}, i32 1
----------------
The first line is over-indented.


================
Comment at: test/CodeGen/avx512vl-builtins.c:4525
   // CHECK-LABEL: @test_mm256_maskz_set1_epi32
-  // CHECK: @llvm.x86.avx512.mask.pbroadcast.d.gpr.256
+    // CHECK:  insertelement <8 x i32> undef, i32 %{{.*}}, i32 0
+  // CHECK:  insertelement <8 x i32> %{{.*}}, i32 %{{.*}}, i32 1
----------------
The first line is over-indented.


https://reviews.llvm.org/D37668




