[PATCH] D105263: [X86] AVX512FP16 instructions enabling 1/6

Fri Jul 9 22:54:12 PDT 2021

pengfei added inline comments.

================
Comment at: clang/lib/Headers/avx512vlfp16intrin.h:74
+static __inline__ __m256h __DEFAULT_FN_ATTRS256 _mm256_abs_ph(__m256h __A) {
+  return (__m256h)_mm256_and_epi32(_mm256_set1_epi32(0x7FFF7FFF), (__m256i)__A);
+}
----------------
craig.topper wrote:
> Why do we use _mm256_set1_epi32 instead of _mm256_set1_epi16?
There's no difference in assembly for immediate value. https://godbolt.org/z/sMbrM611d. But the latency of vpbroadcastd is better than vpbroadcastw in Skylake according to intrinsic guide. Here the only effect is consist with _mm256_and_epi32. Do you think it's better to use _mm256_set1_epi16?

================
Comment at: llvm/include/llvm/IR/RuntimeLibcalls.def:290
 HANDLE_LIBCALL(FPEXT_F16_F128, "__extendhftf2")
+HANDLE_LIBCALL(FPEXT_F16_F80, "__extendhfxf2")
 HANDLE_LIBCALL(FPEXT_F32_F64, "__extendsfdf2")
----------------
craig.topper wrote:
> Is this tested in this patch?
No. I'll move it to the 3rd patch and test it there.

================
Comment at: llvm/lib/Target/X86/X86FastISel.cpp:58
   bool X86ScalarSSEf32;
+  bool X86ScalarAVXf16;

----------------
craig.topper wrote:
> AVX here should maybe be AVX512, but maybe this is pointing out that this name is bad. Would X86ScalarXMMf* be better?
Maybe we can use X86ScalarSSEf16, here SSE means SSE registers? Especially GCC community proposing to support FP16 since SSE2.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105263/new/

https://reviews.llvm.org/D105263