[llvm] [X86] SimplifyDemandedVectorEltsForTargetNode - don't split X86ISD::CVTTP2UI nodes without AVX512VL (PR #154504)
via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 20 03:14:45 PDT 2025
llvmbot wrote:
@llvm/pr-subscribers-backend-x86
Author: Simon Pilgrim (RKSimon)
<details>
<summary>Changes</summary>
Unlike CVTTP2SI, CVTTP2UI is only available on AVX512 targets, so there is no AVX1 variant to fall back to when we split a 512-bit vector. The 128/256-bit variants can therefore only be used if we have AVX512VL.
Fixes #154492
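As a rough illustration of the rule described above, the narrowing decision can be modelled as a small standalone predicate. This is only a sketch: `ConvOp`, `Features`, and `canNarrowConversion` are hypothetical names invented for this example and are not LLVM APIs; the real check lives in `SimplifyDemandedVectorEltsForTargetNode` and queries `Subtarget.hasVLX()`.

```cpp
// Hypothetical stand-ins for illustration only; these are not LLVM types.
enum class ConvOp { CVTTP2SI, CVTTP2UI };

struct Features {
  bool hasAVX512F; // 512-bit AVX512 foundation
  bool hasVLX;     // AVX512VL: 128/256-bit forms of AVX512 instructions
};

// Returns true if a 512-bit truncating conversion whose upper elements are
// not demanded may be narrowed to a 128/256-bit operation.
bool canNarrowConversion(ConvOp Op, bool SrcIsF16, const Features &F) {
  // The unsigned conversion has no pre-AVX512 form, so the narrow
  // variants exist only with AVX512VL.
  if (Op == ConvOp::CVTTP2UI && !F.hasVLX)
    return false;
  // Mirrors the check already present in the diff: f16 sources also
  // need AVX512VL for the narrow forms.
  if (SrcIsF16 && !F.hasVLX)
    return false;
  // Otherwise (e.g. signed conversion from f32) narrowing is allowed,
  // since an AVX1 variant is available as a fallback.
  return true;
}
```

Under these assumptions, an AVX512F-only target keeps the 512-bit `vcvttps2udq`, while an AVX512VL target may use the 256-bit form, which matches the two check prefixes in the new test.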
---
Full diff: https://github.com/llvm/llvm-project/pull/154504.diff
2 Files Affected:
- (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+5-1)
- (added) llvm/test/CodeGen/X86/pr154492.ll (+20)
``````````diff
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 8c3380b0c61da..2c726a9f7f6c9 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -44195,8 +44195,12 @@ bool X86TargetLowering::SimplifyDemandedVectorEltsForTargetNode(
}
// Conversions.
// TODO: Add more CVT opcodes when we have test coverage.
- case X86ISD::CVTTP2SI:
case X86ISD::CVTTP2UI: {
+ if (!Subtarget.hasVLX())
+ break;
+ [[fallthrough]];
+ }
+ case X86ISD::CVTTP2SI: {
if (Op.getOperand(0).getValueType().getVectorElementType() == MVT::f16 &&
!Subtarget.hasVLX())
break;
diff --git a/llvm/test/CodeGen/X86/pr154492.ll b/llvm/test/CodeGen/X86/pr154492.ll
new file mode 100644
index 0000000000000..1ba17594976e1
--- /dev/null
+++ b/llvm/test/CodeGen/X86/pr154492.ll
@@ -0,0 +1,20 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
+; RUN: llc < %s -mtriple=x86_64-- -mattr=+avx512f | FileCheck %s --check-prefix=AVX512F
+; RUN: llc < %s -mtriple=x86_64-- -mattr=+avx512vl | FileCheck %s --check-prefix=AVX512VL
+
+define <16 x i32> @PR154492() {
+; AVX512F-LABEL: PR154492:
+; AVX512F: # %bb.0:
+; AVX512F-NEXT: vxorps %xmm0, %xmm0, %xmm0
+; AVX512F-NEXT: vcvttps2udq %zmm0, %zmm0
+; AVX512F-NEXT: vmovaps %ymm0, %ymm0
+; AVX512F-NEXT: retq
+;
+; AVX512VL-LABEL: PR154492:
+; AVX512VL: # %bb.0:
+; AVX512VL-NEXT: vxorps %xmm0, %xmm0, %xmm0
+; AVX512VL-NEXT: vcvttps2udq %ymm0, %ymm0
+; AVX512VL-NEXT: retq
+ %res = call <16 x i32> @llvm.x86.avx512.mask.cvttps2udq.512(<16 x float> zeroinitializer, <16 x i32> zeroinitializer, i16 255, i32 4)
+ ret <16 x i32> %res
+}
``````````
</details>
https://github.com/llvm/llvm-project/pull/154504