[PATCH] D105265: [X86] AVX512FP16 instructions enabling 3/6

Pengfei Wang via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Aug 12 19:15:05 PDT 2021


pengfei added inline comments.


================
Comment at: clang/lib/Headers/avx512fp16intrin.h:1748
+
+#define _mm_cvt_roundsh_i32(A, R)                                              \
+  (int)__builtin_ia32_vcvtsh2si32((__v8hf)(A), (int)(R))
----------------
LuoYuanke wrote:
> Does it also return i32 on the x86_64 platform? We may unify the intrinsic for both x86 and x86_64 to return i32.
Yes. This intrinsic is used for both x86 and x86_64, and it returns i32 on both.
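
For illustration only, here is a minimal sketch of calling the macro under discussion. It assumes a compiler whose headers provide `_mm_set_sh` and `_mm_cvt_roundsh_i32` and that the file is built with AVX512-FP16 enabled (for example `-mavx512fp16`); the function name `check_cvt_roundsh_i32` is hypothetical. Without that flag the check is skipped rather than executed.

```c
#include <stdio.h>
#if defined(__AVX512FP16__)
#include <immintrin.h>
#endif

/* Hypothetical helper: returns 1 if the conversion gives the expected
 * 32-bit result (or if the check was skipped), 0 otherwise. */
int check_cvt_roundsh_i32(void) {
#if defined(__AVX512FP16__)
  /* Low element holds the half-precision value 7.0; the rest are zero. */
  __m128h x = _mm_set_sh((_Float16)7.0f);

  /* The result is a plain int on both 32-bit and 64-bit targets. */
  int r = _mm_cvt_roundsh_i32(x, _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
  return r == 7;
#else
  /* Assumption: no AVX512-FP16 support was requested at compile time. */
  puts("AVX512-FP16 not enabled at compile time; skipping");
  return 1;
#endif
}
```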


================
Comment at: clang/lib/Headers/avx512fp16intrin.h:1874
+
+static __inline__ __m512 __DEFAULT_FN_ATTRS512 _mm512_cvtxph_ps(__m256h __A) {
+  return (__m512)__builtin_ia32_vcvtph2psx512_mask(
----------------
LuoYuanke wrote:
> VCVTPH2PSX supports broadcast, unlike VCVTPH2PS, but for the intrinsics there is no difference. Do we need to add the new intrinsics? Ditto for its variants.
Yes. The difference is the type. We previously used `__m256i` for the half vector, since `_Float16` was not a legal type at the time.
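
To make the type difference concrete, here is a hedged sketch that feeds the same 16 half-precision values through both intrinsics. It assumes the compiler's headers provide `_mm256_set1_ph` and `_mm256_castph_si256` alongside `_mm512_cvtxph_ps`, and that the file is built with AVX512-FP16 enabled; the function name `compare_half_conversions` is hypothetical. Without that flag the comparison is skipped.

```c
#include <stdio.h>
#if defined(__AVX512FP16__) && defined(__AVX512F__)
#include <immintrin.h>
#endif

/* Hypothetical helper: returns 1 if both conversions agree (or if the
 * check was skipped), 0 otherwise. */
int compare_half_conversions(void) {
#if defined(__AVX512FP16__) && defined(__AVX512F__)
  /* 16 half-precision lanes, each equal to 1.5. */
  __m256h halves = _mm256_set1_ph((_Float16)1.5f);

  /* New intrinsic: the input is typed as a vector of _Float16. */
  __m512 from_typed = _mm512_cvtxph_ps(halves);

  /* Older intrinsic: the same bits, carried in an integer vector. */
  __m512 from_bits = _mm512_cvtph_ps(_mm256_castph_si256(halves));

  float a[16], b[16];
  _mm512_storeu_ps(a, from_typed);
  _mm512_storeu_ps(b, from_bits);
  for (int i = 0; i < 16; ++i) {
    if (a[i] != b[i] || a[i] != 1.5f)
      return 0;
  }
  return 1;
#else
  /* Assumption: no AVX512-FP16 support was requested at compile time. */
  puts("AVX512-FP16 not enabled at compile time; skipping");
  return 1;
#endif
}
```

The two calls differ only in the parameter type, which is the point of the reply above: the `__m256h` form lets callers keep half vectors in a floating-point vector type instead of casting through `__m256i`.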


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105265/new/

https://reviews.llvm.org/D105265


