[PATCH] D105263: [X86] AVX512FP16 instructions enabling 1/6

Wed Aug 4 07:02:01 PDT 2021

LuoYuanke added inline comments.

================
Comment at: clang/lib/CodeGen/TargetInfo.cpp:3405
+/// half member at the specified offset.  For example, {int,{half}} has a
+/// float at offset 4.  It is conservatively correct for this routine to return
+/// false.
----------------
float -> half?

================
Comment at: clang/lib/Headers/avx512fp16intrin.h:292
+
+  return (__m128h)__builtin_ia32_loadsh128_mask((__v8hf *)__A, src, __U & 1);
+}
----------------
Just be curious, why not directly use __W?

================
Comment at: clang/lib/Headers/avx512fp16intrin.h:319
+    __m512h_u __v;
+  } __attribute__((__packed__, __may_alias__));
+  return ((const struct __loadu_ph *)__p)->__v;
----------------
What is __may_alias__ used for?

================
Comment at: clang/lib/Headers/avx512fp16intrin.h:350
+                                                               __m128h __A) {
+  __builtin_ia32_storesh128_mask((__v8hf *)__W, __A, __U & 1);
+}
----------------
I see in _mm_mask_load_sh(), we create a __m128h with upper bits zero, not sure we also need it in store intrinsic.

================
Comment at: clang/lib/Headers/avx512fp16intrin.h:419
+static __inline__ short __DEFAULT_FN_ATTRS128 _mm_cvtsi128_si16(__m128i __a) {
+  __v8hi __b = (__v8hi)__a;
+  return __b[0];
----------------
Why not return __a[0] directly?

================
Comment at: clang/test/CodeGen/X86/avx512fp16-abi.c:89
+  _Float16 a;
+  float b;
+};
----------------
Any false test case that have padding between a and b?

================
Comment at: llvm/include/llvm/IR/Intrinsics.td:315
 def llvm_v8f16_ty      : LLVMType<v8f16>;    //  8 x half (__fp16)
+def llvm_v16f16_ty     : LLVMType<v16f16>;   // 16 x half (__fp16)
+def llvm_v32f16_ty     : LLVMType<v32f16>;   // 32 x half (__fp16)
----------------
Not sure about the legacy comments, should it be _Float16 now?

================
Comment at: llvm/include/llvm/Target/TargetSelectionDAG.td:1054
+def extloadvf16 : PatFrag<(ops node:$ptr), (extload node:$ptr)> {
+  let IsLoad = 1;
+  let ScalarMemoryVT = f16;
----------------
I notice it is true for other extload. Is it same to "true"?

================
Comment at: llvm/lib/Target/X86/Disassembler/X86Disassembler.cpp:341
     if ((insn->mode == MODE_64BIT || (byte1 & 0xc0) == 0xc0) &&
-        ((~byte1 & 0xc) == 0xc) && ((byte2 & 0x4) == 0x4)) {
+        ((~byte1 & 0x8) == 0x8) && ((byte2 & 0x4) == 0x4)) {
       insn->vectorExtensionType = TYPE_EVEX;
----------------
This is the same to ((byte1 & 0x8) == 0x0)?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105263/new/

https://reviews.llvm.org/D105263