[llvm] 23bc5b6 - [AArch64] Mark bf16 as custom for truncating stores & add a comment

Sun Mar 3 22:33:46 PST 2024

Author: David Majnemer
Date: 2024-03-04T06:33:25Z
New Revision: 23bc5b6392c61980d24cc23a61a020f7aaf858bf

URL: https://github.com/llvm/llvm-project/commit/23bc5b6392c61980d24cc23a61a020f7aaf858bf
DIFF: https://github.com/llvm/llvm-project/commit/23bc5b6392c61980d24cc23a61a020f7aaf858bf.diff

LOG: [AArch64] Mark bf16 as custom for truncating stores & add a comment

While we don't use SVE2 as a fallback for missing NEON instructions for
BF16, it is confusing to break symmetry with fp16.

While we are here, add a comment explaining how BF16 immediates work.

Added: 
    

Modified: 
    llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 475c73c3588dbc..5af89bc5192a1b 100644

--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -1571,7 +1571,12 @@ AArch64TargetLowering::AArch64TargetLowering(const TargetMachine &TM,
         setOperationAction(ISD::VECREDUCE_SEQ_FADD, VT, Custom);
     }
 
-    if (!Subtarget->isNeonAvailable()) {// TODO(majnemer)
+    if (!Subtarget->isNeonAvailable()) {
+      setTruncStoreAction(MVT::v2f32, MVT::v2bf16, Custom);
+      setTruncStoreAction(MVT::v4f32, MVT::v4bf16, Custom);
+      setTruncStoreAction(MVT::v8f32, MVT::v8bf16, Custom);
+      setTruncStoreAction(MVT::v2f64, MVT::v2bf16, Custom);
+      setTruncStoreAction(MVT::v4f64, MVT::v4bf16, Custom);
       setTruncStoreAction(MVT::v2f32, MVT::v2f16, Custom);
       setTruncStoreAction(MVT::v4f32, MVT::v4f16, Custom);
       setTruncStoreAction(MVT::v8f32, MVT::v8f16, Custom);
@@ -10385,13 +10390,17 @@ bool AArch64TargetLowering::isFPImmLegal(const APFloat &Imm, EVT VT,
   bool IsLegal = false;
   // We can materialize #0.0 as fmov $Rd, XZR for 64-bit, 32-bit cases, and
   // 16-bit case when target has full fp16 support.
+  // We encode bf16 bit patterns as if they were fp16. This results in very
+  // strange looking assembly but should populate the register with appropriate
+  // values. Let's say we wanted to encode 0xR3FC0 which is 1.5 in BF16. We will
+  // end up encoding this as the imm8 0x7f. This imm8 will be expanded to the
+  // FP16 1.9375 which shares the same bit pattern as BF16 1.5.
   // FIXME: We should be able to handle f128 as well with a clever lowering.
   const APInt ImmInt = Imm.bitcastToAPInt();
   if (VT == MVT::f64)
     IsLegal = AArch64_AM::getFP64Imm(ImmInt) != -1 || Imm.isPosZero();
   else if (VT == MVT::f32)
     IsLegal = AArch64_AM::getFP32Imm(ImmInt) != -1 || Imm.isPosZero();
-  // TODO(majnemer): double check this...
   else if (VT == MVT::f16 || VT == MVT::bf16)
     IsLegal =
         (Subtarget->hasFullFP16() && AArch64_AM::getFP16Imm(ImmInt) != -1) ||