[llvm] [AArch64] Add sve bf16 fpext and fpround costs. (PR #150485)

Fri Aug 1 05:56:52 PDT 2025

================
@@ -3516,11 +3532,22 @@ InstructionCost AArch64TTIImpl::getCastInstrCost(unsigned Opcode, Type *Dst,
       {ISD::FP_EXTEND, MVT::nxv4f32, MVT::nxv4f16, 1},
       {ISD::FP_EXTEND, MVT::nxv8f32, MVT::nxv8f16, 2},
 
+      // Extend from nxvmbf16 to nxvmf32.
+      {ISD::FP_EXTEND, MVT::nxv2f32, MVT::nxv2bf16, 1}, // lsl
+      {ISD::FP_EXTEND, MVT::nxv4f32, MVT::nxv4bf16, 1}, // lsl
+      {ISD::FP_EXTEND, MVT::nxv8f32, MVT::nxv8bf16, 4}, // unpck+unpck+lsl+lsl
----------------
davemgreen wrote:

The jump from 1 to 4 feels a bit odd. I was thinking that we might be able to use zip1/zip2 with a zero vector to pull the bits into the right lanes. For the moment it uses 4 though, thanks.
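
To illustrate the zip idea, here is a rough scalar model in plain C++; it is not SVE code and not the actual lowering. It assumes little-endian lane order, and the helper names (`bf16BitsToFloat`, `zipHalf`, `lanesAsFloat`) are made up for this sketch. A bf16 value is the top 16 bits of an f32, so interleaving a zero vector into the even 16-bit lanes should give the same bits as unpack followed by `lsl #16`.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Per-lane reference: extending bf16 to f32 is a 16-bit left shift of the
// raw bits (this models the "lsl" in the cost table comments).
inline float bf16BitsToFloat(uint16_t Bits) {
  uint32_t Wide = static_cast<uint32_t>(Bits) << 16;
  float F;
  std::memcpy(&F, &Wide, sizeof(F));
  return F;
}

// Hypothetical model of zip1/zip2 on 16-bit lanes: interleave lanes of A and
// B, reading from the low half of each input (zip1) or the high half (zip2).
template <std::size_t N>
std::array<uint16_t, N> zipHalf(const std::array<uint16_t, N> &A,
                                const std::array<uint16_t, N> &B, bool High) {
  std::array<uint16_t, N> Out{};
  const std::size_t Base = High ? N / 2 : 0;
  for (std::size_t I = 0; I < N / 2; ++I) {
    Out[2 * I] = A[Base + I];     // even lane comes from A
    Out[2 * I + 1] = B[Base + I]; // odd lane comes from B
  }
  return Out;
}

// Reinterpret each pair of 16-bit lanes as one 32-bit lane, with the even
// lane as the low half (little-endian assumption), and view it as a float.
template <std::size_t N>
std::array<float, N / 2> lanesAsFloat(const std::array<uint16_t, N> &V) {
  std::array<float, N / 2> Out{};
  for (std::size_t I = 0; I < N / 2; ++I) {
    uint32_t Wide = static_cast<uint32_t>(V[2 * I]) |
                    (static_cast<uint32_t>(V[2 * I + 1]) << 16);
    std::memcpy(&Out[I], &Wide, sizeof(float));
  }
  return Out;
}
```

With the zero vector as the first operand, `zipHalf(Zero, Src, false)` and `zipHalf(Zero, Src, true)` each produce one half of the extended result, so in this model the nxv8bf16 case would need two operations rather than four. Whether the backend can actually select that sequence is a separate question.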

https://github.com/llvm/llvm-project/pull/150485