[PATCH] D105265: [X86] AVX512FP16 instructions enabling 3/6
Pengfei Wang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 12 19:15:05 PDT 2021
pengfei added inline comments.
================
Comment at: clang/lib/Headers/avx512fp16intrin.h:1748
+
+#define _mm_cvt_roundsh_i32(A, R) \
+ (int)__builtin_ia32_vcvtsh2si32((__v8hf)(A), (int)(R))
----------------
LuoYuanke wrote:
> Does it also return i32 on the x86_64 platform? We may unify the intrinsic for both x86 and x86_64 to return i32.
Yes. This is used for both x86 and x86_64.
================
Comment at: clang/lib/Headers/avx512fp16intrin.h:1874
+
+static __inline__ __m512 __DEFAULT_FN_ATTRS512 _mm512_cvtxph_ps(__m256h __A) {
+ return (__m512)__builtin_ia32_vcvtph2psx512_mask(
----------------
LuoYuanke wrote:
> VCVTPH2PSX supports broadcast, unlike VCVTPH2PS, but for intrinsics there is no difference. Do we need to add the new intrinsics? Ditto for its variants.
Yes. The difference is the type: the new intrinsics take `__m256h`. We previously used `__m256i` for the half vector because `_Float16` was not a legal type at the time.
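To illustrate the point about types, here is a minimal portable sketch in plain C. It does not use the real intrinsics or builtins. `raw256_t`, `half256_t`, `float512_t`, `cvt_raw_ps`, and `cvt_half_ps` are hypothetical names invented for this example; they stand in for an untyped integer vector (like `__m256i`), a typed half vector (like `__m256h`), and the two conversion entry points. The conversion itself is a scalar software model of binary16 to binary32, not what the hardware instruction does internally.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define LANES 16 /* 256 bits / 16 bits per half-precision lane */

/* Hypothetical stand-in for __m256i: 256 bits with no element type implied. */
typedef struct { uint16_t bits[LANES]; } raw256_t;
/* Hypothetical stand-in for __m256h: the type itself says "16 halves". */
typedef struct { uint16_t half[LANES]; } half256_t;
/* Hypothetical stand-in for __m512: 16 single-precision lanes. */
typedef struct { float f[LANES]; } float512_t;

/* Software model: widen one IEEE 754 binary16 bit pattern to float. */
static float half_bits_to_float(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t e = (h >> 10) & 0x1Fu;
    uint32_t man = h & 0x3FFu;
    uint32_t out;
    if (e == 0 && man == 0) {
        out = sign;                              /* signed zero */
    } else if (e == 0) {
        int shift = 0;                           /* subnormal: normalize */
        while ((man & 0x400u) == 0) { man <<= 1; shift++; }
        man &= 0x3FFu;
        out = sign | ((uint32_t)(113 - shift) << 23) | (man << 13);
    } else if (e == 0x1Fu) {
        out = sign | 0x7F800000u | (man << 13);  /* infinity or NaN */
    } else {
        out = sign | ((e + 112u) << 23) | (man << 13); /* rebias 15 -> 127 */
    }
    float f;
    memcpy(&f, &out, sizeof f);
    return f;
}

/* Old-style signature: the caller passes untyped bits. */
static float512_t cvt_raw_ps(raw256_t a) {
    float512_t r;
    for (int i = 0; i < LANES; i++) r.f[i] = half_bits_to_float(a.bits[i]);
    return r;
}

/* New-style signature: same computation, but the parameter type is a half vector. */
static float512_t cvt_half_ps(half256_t a) {
    float512_t r;
    for (int i = 0; i < LANES; i++) r.f[i] = half_bits_to_float(a.half[i]);
    return r;
}

/* Returns 1 when both entry points agree lane for lane on the same input bits. */
static int same_result_demo(void) {
    raw256_t raw;
    half256_t typed;
    for (int i = 0; i < LANES; i++) raw.bits[i] = typed.half[i] = 0x3C00u; /* 1.0 */
    float512_t a = cvt_raw_ps(raw), b = cvt_half_ps(typed);
    assert(a.f[0] == 1.0f);
    return memcmp(&a, &b, sizeof a) == 0;
}
```

Both functions produce the same lanes; only the signature differs, so passing the wrong kind of vector to `cvt_half_ps` is a compile-time error rather than a silent reinterpretation.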
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D105265/new/
https://reviews.llvm.org/D105265