[llvm-branch-commits] [llvm] release/21.x: [X86] SimplifyDemandedVectorEltsForTargetNode - don't split X86ISD::CVTTP2UI nodes without AVX512VL (#154504) (PR #154526)

via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Wed Aug 20 05:47:34 PDT 2025


https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/154526

Backport d770567a514716cdb250a2dee635435c22622e34

Requested by: @nikic

>From 0cf566fd6434fcd52a36ded92b4bfdcde6b9681d Mon Sep 17 00:00:00 2001
From: Simon Pilgrim <llvm-dev at redking.me.uk>
Date: Wed, 20 Aug 2025 12:18:10 +0100
Subject: [PATCH] [X86] SimplifyDemandedVectorEltsForTargetNode - don't split
 X86ISD::CVTTP2UI nodes without AVX512VL (#154504)

Unlike CVTTP2SI, CVTTP2UI is only available on AVX512 targets, so we
don't fallback to the AVX1 variant when we split a 512-bit vector, so we
can only use the 128/256-bit variants if we have AVX512VL.

Fixes #154492

(cherry picked from commit d770567a514716cdb250a2dee635435c22622e34)
---
 llvm/lib/Target/X86/X86ISelLowering.cpp |  6 +++++-
 llvm/test/CodeGen/X86/pr154492.ll       | 20 ++++++++++++++++++++
 2 files changed, 25 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/CodeGen/X86/pr154492.ll

diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index c7839baf7de8e..85e5ebc385c68 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -44178,8 +44178,12 @@ bool X86TargetLowering::SimplifyDemandedVectorEltsForTargetNode(
     }
       // Conversions.
       // TODO: Add more CVT opcodes when we have test coverage.
-    case X86ISD::CVTTP2SI:
     case X86ISD::CVTTP2UI: {
+      if (!Subtarget.hasVLX())
+        break;
+      [[fallthrough]];
+    }
+    case X86ISD::CVTTP2SI: {
       if (Op.getOperand(0).getValueType().getVectorElementType() == MVT::f16 &&
           !Subtarget.hasVLX())
         break;
diff --git a/llvm/test/CodeGen/X86/pr154492.ll b/llvm/test/CodeGen/X86/pr154492.ll
new file mode 100644
index 0000000000000..1ba17594976e1
--- /dev/null
+++ b/llvm/test/CodeGen/X86/pr154492.ll
@@ -0,0 +1,20 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc < %s -mtriple=x86_64-- -mattr=+avx512f  | FileCheck %s --check-prefix=AVX512F
+; RUN: llc < %s -mtriple=x86_64-- -mattr=+avx512vl | FileCheck %s --check-prefix=AVX512VL
+
+define <16 x i32> @PR154492() {
+; AVX512F-LABEL: PR154492:
+; AVX512F:       # %bb.0:
+; AVX512F-NEXT:    vxorps %xmm0, %xmm0, %xmm0
+; AVX512F-NEXT:    vcvttps2udq %zmm0, %zmm0
+; AVX512F-NEXT:    vmovaps %ymm0, %ymm0
+; AVX512F-NEXT:    retq
+;
+; AVX512VL-LABEL: PR154492:
+; AVX512VL:       # %bb.0:
+; AVX512VL-NEXT:    vxorps %xmm0, %xmm0, %xmm0
+; AVX512VL-NEXT:    vcvttps2udq %ymm0, %ymm0
+; AVX512VL-NEXT:    retq
+  %res = call <16 x i32> @llvm.x86.avx512.mask.cvttps2udq.512(<16 x float> zeroinitializer, <16 x i32> zeroinitializer, i16 255, i32 4)
+  ret <16 x i32> %res
+}



More information about the llvm-branch-commits mailing list