[llvm] [X86] Explicitly widen larger than v4f16 to the legal v8f16 (NFC) (PR #153839)

Sun Aug 17 11:40:13 PDT 2025

https://github.com/anemet updated https://github.com/llvm/llvm-project/pull/153839

From 355d1fc0034894d0b94fc2c23678f0fc2f42e282 Mon Sep 17 00:00:00 2001
From: Adam Nemet <anemet at apple.com>
Date: Wed, 13 Aug 2025 22:59:49 -0700
Subject: [PATCH] [X86] Explicitly widen larger than v4f16 to the legal v8f16
 (NFC)

This patch makes the current behavior explicit to prepare for adding
VTs for v[567]f16.

Right now these types are extended EVTs rather than simple MVTs, so
getPreferredVectorAction is never consulted for them; they are simply
widened to the next legal power-of-two vector type.  For SSE2 this is
v8f16.

Without this preparatory patch, however, the behavior would change
after adding these types.  getPreferredVectorAction would try to
split them because this is the current behavior for any f16 vector
type that is not legal.

See https://github.com/llvm/llvm-project/issues/152150 for a lot more
detail, in particular on how splitting these new types leads to an
inconsistency between NumRegistersForVT and getTypeAction.

The patch ensures that once the new types are added they continue to
be widened rather than split.  As a result, the follow-up patch that
enables v[567]f16 will be NFC for x86.
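
As a rough illustration of the decision described above, here is a
standalone sketch of the f16 clause only.  It is not LLVM code: the
Action enum, f16VectorAction and the WithThisPatch flag are hypothetical
names, and treating the fall-through as "widen" is an assumption taken
from the description rather than from the rest of the function.

```cpp
// Hypothetical model of the f16 clause in getPreferredVectorAction.
// None of these names exist in LLVM; they only mirror the condition
// changed by this patch.
enum class Action { Split, Widen, Default };

constexpr Action f16VectorAction(unsigned NumElts, bool HasF16C,
                                 bool WithThisPatch) {
  // Single-element vectors are excluded by the clause itself.
  if (NumElts == 1)
    return Action::Default;
  // Before the patch every f16 vector could be split; with the patch
  // only vectors of at most four elements can.
  const bool InSplitRange = !WithThisPatch || NumElts <= 4;
  if (InSplitRange && !HasF16C)
    return Action::Split;
  // Assumption: anything not split here ends up widened.
  return Action::Widen;
}

// v7f16 without F16C: split if it were an MVT today, widened with the patch.
static_assert(f16VectorAction(7, false, false) == Action::Split, "");
static_assert(f16VectorAction(7, false, true) == Action::Widen, "");
// v4f16 without F16C keeps being split.
static_assert(f16VectorAction(4, false, true) == Action::Split, "");
```

The WithThisPatch flag exists only to contrast the two conditions side
by side; the real function has no such parameter.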
---
 llvm/lib/Target/X86/X86ISelLowering.cpp |  4 +++-
 llvm/test/CodeGen/X86/pr152150.ll       | 14 ++++++++++++++
 2 files changed, 17 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/CodeGen/X86/pr152150.ll

diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 7a816de53dbd3..52e0bb8a9b83d 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -2756,8 +2756,10 @@ X86TargetLowering::getPreferredVectorAction(MVT VT) const {
       !Subtarget.hasBWI())
     return TypeSplitVector;
 
+  // Since v8f16 is legal, widen anything over v4f16.
   if (!VT.isScalableVector() && VT.getVectorNumElements() != 1 &&
-      !Subtarget.hasF16C() && VT.getVectorElementType() == MVT::f16)
+      VT.getVectorNumElements() <= 4 && !Subtarget.hasF16C() &&
+      VT.getVectorElementType() == MVT::f16)
     return TypeSplitVector;
 
   if (!VT.isScalableVector() && VT.getVectorNumElements() != 1 &&
diff --git a/llvm/test/CodeGen/X86/pr152150.ll b/llvm/test/CodeGen/X86/pr152150.ll
new file mode 100644
index 0000000000000..6db3e555028cc
--- /dev/null
+++ b/llvm/test/CodeGen/X86/pr152150.ll
@@ -0,0 +1,14 @@
+; RUN: llc < %s -mtriple=x86_64-unknown-unknown-eabi-elf | FileCheck %s
+
+; CHECK-LABEL: conv2d
+define dso_local void @conv2d() {
+.preheader:
+  br label %0
+
+0:                                                ; preds = %0, %.preheader
+  %1 = phi [4 x <7 x half>] [ zeroinitializer, %.preheader ], [ %4, %0 ]
+  %2 = extractvalue [4 x <7 x half>] %1, 0
+  %3 = extractvalue [4 x <7 x half>] %1, 1
+  %4 = insertvalue [4 x <7 x half>] poison, <7 x half> poison, 3
+  br label %0
+}