[PATCH] D105265: [X86] AVX512FP16 instructions enabling 3/6

LuoYuanke via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 12 06:29:41 PDT 2021


LuoYuanke added inline comments.


================
Comment at: clang/lib/Headers/avx512fp16intrin.h:1748
+
+#define _mm_cvt_roundsh_i32(A, R)                                              \
+  (int)__builtin_ia32_vcvtsh2si32((__v8hf)(A), (int)(R))
----------------
Does this also return i32 on x86_64? We could unify the intrinsic so that it returns i32 on both x86 and x86_64.


================
Comment at: clang/lib/Headers/avx512fp16intrin.h:1874
+
+static __inline__ __m512 __DEFAULT_FN_ATTRS512 _mm512_cvtxph_ps(__m256h __A) {
+  return (__m512)__builtin_ia32_vcvtph2psx512_mask(
----------------
VCVTPH2PSX supports broadcast, unlike VCVTPH2PS, but at the intrinsic level there is no difference. Do we need to add the new intrinsics? Ditto for its variants.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105265/new/

https://reviews.llvm.org/D105265


