[PATCH] D150913: [Clang][Bfloat16] Upgrade __bf16 to arithmetic type, change mangling, and extend excess precision support.

Fri May 19 07:53:16 PDT 2023

pengfei added a comment.

Great work! Thanks for the patch!

================
Comment at: clang/include/clang/AST/ASTContext.h:1102
   CanQualType HalfTy; // [OpenCL 6.1.1.1], ARM NEON
-  CanQualType BFloat16Ty;
+  CanQualType BFloat16Ty; // ISO/IEC/IEEE 60559.
   CanQualType Float16Ty; // C11 extension ISO/IEC TS 18661-3
----------------
Don't have a look at ISO/IEC/IEEE 60559, but I doubt BF16 is still not a IEEE type for now.

================
Comment at: clang/lib/Basic/Targets/AMDGPU.h:121
   bool hasBFloat16Type() const override { return isAMDGCN(getTriple()); }
-  const char *getBFloat16Mangling() const override { return "u6__bf16"; };
+  const char *getBFloat16Mangling() const override { return "DF16b"; };

----------------
I think it's time to bring D139608 back with this patch :)

================
Comment at: clang/lib/Basic/Targets/X86.cpp:362-363
       HasX87 = true;
+    } else if (Feature == "+fullbf16") {
+      HasFullBFloat16 = true;
     }
----------------
Maybe not need it.

================
Comment at: clang/lib/Basic/Targets/X86.cpp:381

     HasBFloat16 = SSELevel >= SSE2;

----------------
I'm not sure if I understand the meaning of `HasFullBFloat16`. If it is used for target that supports arithmetic `__bf16`, we should not use `+fullbf16` but always enable it for SSE2, i.e., `HasFullBFloat16 = SSELevel >= SSE2`. Because X86 GCC already supports arithmetic for `__bf16`.

If this is used in the way like `HasLegalHalfType`, we should enable it once we have a full BF16 ISA on X86. `fullbf16` doesn't make much sense to me.

================
Comment at: clang/lib/Basic/Targets/X86.cpp:1122
       .Case("xsaveopt", HasXSAVEOPT)
+      .Case("fullbf16", HasFullBFloat16)
       .Default(false);
----------------
ditto.

================
Comment at: clang/test/CodeGen/X86/bfloat16.cpp:2-3
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 2
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 -S -emit-llvm %s -o - | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -S -emit-llvm %s -o - | FileCheck -check-prefix=CHECK-NBF16 %s
+
----------------
The backend has already support lowering of `bfloat`, I don't think it's necessary to do extra work in FE unless for excess-precision.

================
Comment at: clang/test/CodeGen/X86/fexcess-precision-bfloat16.c:7
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
+// RUN: -fbfloat16-excess-precision=fast -target-feature +fullbf16 \
+// RUN: -emit-llvm -o - %s | FileCheck -check-prefixes=CHECK-NO-EXT %s
----------------
The tests here make me guess you want to use `fullbf16` the same as `HasLegalHalfType`.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150913/new/

https://reviews.llvm.org/D150913