[PATCH] D37668: [X86][intrinsics] lower _mm[256|512]_mask[z]_set1_epi[8|16|32|64] intrinsic to IR
Craig Topper via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Tue Sep 12 00:30:21 PDT 2017
craig.topper added inline comments.
================
Comment at: include/clang/Basic/BuiltinsX86.def:981
-TARGET_BUILTIN(__builtin_ia32_pbroadcastd512_gpr_mask, "V16iiV16iUs", "", "avx512f")
TARGET_BUILTIN(__builtin_ia32_pbroadcastq512_mem_mask, "V8LLiLLiV8LLiUc", "", "avx512f")
TARGET_BUILTIN(__builtin_ia32_loaddqusi512_mask, "V16iiC*V16iUs", "", "avx512f")
----------------
I think your patch removed the only use of __builtin_ia32_pbroadcastq512_mem_mask, right? Does your change work properly in 32-bit mode?
================
Comment at: lib/Headers/avx512bwintrin.h:2031
{
- return (__m512i) __builtin_ia32_pbroadcastb512_gpr_mask (__A,
- (__v64qi) __O,
- __M);
+ __m512i __V = _mm512_set1_epi8(__A);
+ return (__m512i) __builtin_ia32_selectb_512(__M,(__v64qi)__V,(__v64qi) __O);
----------------
We usually don't declare variables in the intrinsics if we can avoid it. Just nest the calls.
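For illustration only, here is a scalar sketch of what the nested form computes: a broadcast of __A fed straight into a per-lane select against __O, with no temporary. The type vec64x8 and the model_* functions are hypothetical stand-ins invented for this sketch; they are not the header's code and not clang builtins, and the sketch assumes mask bit i controls lane i.

```c
#include <stdint.h>

#define LANES 64

/* Hypothetical stand-in for a 512-bit vector viewed as 64 signed bytes. */
typedef struct { int8_t lane[LANES]; } vec64x8;

/* Models the broadcast (_mm512_set1_epi8): copy one byte into every lane. */
static vec64x8 model_set1_epi8(int8_t a) {
    vec64x8 v;
    for (int i = 0; i < LANES; ++i)
        v.lane[i] = a;
    return v;
}

/* Models the select: lane i comes from t if mask bit i is set, else from f. */
static vec64x8 model_select_b(uint64_t mask, vec64x8 t, vec64x8 f) {
    vec64x8 r;
    for (int i = 0; i < LANES; ++i)
        r.lane[i] = ((mask >> i) & 1) ? t.lane[i] : f.lane[i];
    return r;
}

/* Models the masked broadcast with the calls nested, so no local is declared. */
static vec64x8 model_mask_set1_epi8(vec64x8 o, uint64_t m, int8_t a) {
    return model_select_b(m, model_set1_epi8(a), o);
}
```

Passing an all-zero vector as the first argument models the maskz variant in the same way.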
================
Comment at: test/CodeGen/avx512vl-builtins.c:4511
// CHECK-LABEL: @test_mm256_mask_set1_epi32
- // CHECK: @llvm.x86.avx512.mask.pbroadcast.d.gpr.256
+ // CHECK: insertelement <8 x i32> undef, i32 %{{.*}}, i32 0
+ // CHECK: insertelement <8 x i32> %{{.*}}, i32 %{{.*}}, i32 1
----------------
The first line is overindented
================
Comment at: test/CodeGen/avx512vl-builtins.c:4525
// CHECK-LABEL: @test_mm256_maskz_set1_epi32
- // CHECK: @llvm.x86.avx512.mask.pbroadcast.d.gpr.256
+ // CHECK: insertelement <8 x i32> undef, i32 %{{.*}}, i32 0
+ // CHECK: insertelement <8 x i32> %{{.*}}, i32 %{{.*}}, i32 1
----------------
The first line is overindented
https://reviews.llvm.org/D37668