[PATCH] D150913: [Clang][Bfloat16] Upgrade __bf16 to arithmetic type, change mangling, and extend excess precision support.

M. Zeeshan Siddiqui via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon May 22 13:48:58 PDT 2023


codemzs added inline comments.


================
Comment at: clang/lib/Basic/Targets/X86.cpp:362-363
       HasX87 = true;
+    } else if (Feature == "+fullbf16") {
+      HasFullBFloat16 = true;
     }
----------------
pengfei wrote:
> Maybe not need it.
Clarified on the other thread but if you have questions please feel free to post here and I will address them.


================
Comment at: clang/lib/Basic/Targets/X86.cpp:381
 
     HasBFloat16 = SSELevel >= SSE2;
 
----------------
pengfei wrote:
> zahiraam wrote:
> > codemzs wrote:
> > > rjmccall wrote:
> > > > pengfei wrote:
> > > > > I'm not sure if I understand the meaning of `HasFullBFloat16`. If it is used for target that supports arithmetic `__bf16`, we should not use `+fullbf16` but always enable it for SSE2, i.e., `HasFullBFloat16 = SSELevel >= SSE2`. Because X86 GCC already supports arithmetic for `__bf16`.
> > > > > 
> > > > > If this is used in the way like `HasLegalHalfType`, we should enable it once we have a full BF16 ISA on X86. `fullbf16` doesn't make much sense to me.
> > > > At the moment, we haven't done the work to emulate BFloat16 arithmetic in any of the three ways we can do that: Clang doesn't promote it in IRGen, LLVM doesn't promote it in legalization, and we don't have compiler-rt functions for it.  If we emit these instructions, they'll just sail through LLVM and fail in the backend.  So in the short term, we have to restrict this to targets that directly support BFloat16 arithmetic in hardware, which doesn't include x86.
> > > > 
> > > > Once we have that emulation support, I agree that the x86 targets should enable this whenever they would enable `__bf16`.
> > > @rjmccall, I concur and just wanted to confirm this change indeed intends to provide `BFloat16` emulation support, utilizing excess precision for promotion to `float`. The `HasFullBFloat16` switch is designed to determine excess precision support automatically when the hardware does not natively support `bfloat16` arithmetic.
> > > At the moment, we haven't done the work to emulate BFloat16 arithmetic in any of the three ways we can do that: Clang doesn't promote it in IRGen, LLVM doesn't promote it in legalization, and we don't have compiler-rt functions for it.  If we emit these instructions, they'll just sail through LLVM and fail in the backend.  So in the short term, we have to restrict this to targets that directly support BFloat16 arithmetic in hardware, which doesn't include x86.
> > > 
> > > Once we have that emulation support, I agree that the x86 targets should enable this whenever they would enable `__bf16`.
> > 
> > Would be nice to add a comment to clarify it. 
> > LLVM doesn't promote it in legalization, and we don't have compiler-rt functions for it.
> 
> That's not true: https://godbolt.org/z/jxf5E83vG.
> 
> > The `HasFullBFloat16` switch is designed to determine excess precision support automatically when the hardware does not natively support bfloat16 arithmetic.
> 
> Makes sense to me.
@pengfei, you're right. As part of D126953, the x86 backend received `bfloat16` emulation support. Also, I hope my explanation about the `HasFullBFloat16` flag addressed your questions. Please let me know if further clarification/change is needed.


================
Comment at: clang/test/CodeGen/X86/bfloat16.cpp:2-3
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 2
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -target-feature +fullbf16 -S -emit-llvm %s -o - | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -S -emit-llvm %s -o - | FileCheck -check-prefix=CHECK-NBF16 %s
+
----------------
zahiraam wrote:
> pengfei wrote:
> > The backend has already support lowering of `bfloat`, I don't think it's necessary to do extra work in FE unless for excess-precision.
> > The backend has already support lowering of `bfloat`, I don't think it's necessary to do extra work in FE unless for excess-precision.
> 
> +1.
@pengfei @zahiraam I added this test to verify bfloat16 IR gen functionality, considering both scenarios: with and without native bfloat16 support. However, if you believe it's more beneficial to omit it, I'm open to doing so. Happy to also move this test to another target that doesn't have backend support for emulation. 


================
Comment at: clang/test/CodeGen/X86/fexcess-precision-bfloat16.c:7
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown \
+// RUN: -fbfloat16-excess-precision=fast -target-feature +fullbf16 \
+// RUN: -emit-llvm -o - %s | FileCheck -check-prefixes=CHECK-NO-EXT %s
----------------
pengfei wrote:
> The tests here make me guess you want to use `fullbf16` the same as `HasLegalHalfType`.
Yes that is correct it is just to emulate the correct IR gen if x86 were to have native support. Happy to remove these tests if you feel that is better?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150913/new/

https://reviews.llvm.org/D150913



More information about the cfe-commits mailing list