[llvm] [X86] SimplifyDemandedVectorEltsForTargetNode - reduce the size of VPERMV v16f32/v16i32 nodes if the upper elements are not demanded (PR #134890)
via llvm-commits
llvm-commits at lists.llvm.org
Tue Apr 8 10:32:13 PDT 2025
llvmbot wrote:
@llvm/pr-subscribers-backend-x86
Author: Simon Pilgrim (RKSimon)
<details>
<summary>Changes</summary>
Missed in #133923: even without AVX512VL, we can replace VPERMV v16f32/v16i32 nodes with the AVX2 v8f32/v8i32 equivalents when the upper elements are not demanded.
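As a rough illustration of why the narrowing is safe, here is a minimal scalar model in C++. It is not LLVM code: `permuteVar`, `lowHalf` and `narrowedPermuteMatches` are hypothetical names introduced only for this sketch. It assumes the usual variable-permute semantics, where result element `i` is `Src[Idx[i] % N]`. Under that assumption, if only the low elements of the result are demanded and their indices all point into the low half of the source, an 8-element permute of the low halves produces the same demanded elements as the 16-element permute. With the mask `[0,1,3,6]` from the test and four demanded elements, the two agree; an index such as 9 would wrap to 1 in the 8-element form, so the check fails for it.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of a variable permute (assumed semantics, not LLVM code):
// Result[i] = Src[Idx[i] % N], i.e. only the low log2(N) index bits are used.
template <std::size_t N>
std::array<float, N> permuteVar(const std::array<float, N> &Src,
                                const std::array<std::uint32_t, N> &Idx) {
  std::array<float, N> Result{};
  for (std::size_t I = 0; I != N; ++I)
    Result[I] = Src[Idx[I] % N];
  return Result;
}

// Hypothetical helper: extract the low 8 elements of a 16-element vector.
template <typename T>
std::array<T, 8> lowHalf(const std::array<T, 16> &V) {
  std::array<T, 8> Lo{};
  for (std::size_t I = 0; I != 8; ++I)
    Lo[I] = V[I];
  return Lo;
}

// Returns true if the first NumDemanded result elements of the 16-wide
// permute equal those of the 8-wide permute applied to the low halves.
bool narrowedPermuteMatches(const std::array<float, 16> &Src,
                            const std::array<std::uint32_t, 16> &Idx,
                            std::size_t NumDemanded) {
  const std::array<float, 16> Wide = permuteVar<16>(Src, Idx);
  const std::array<float, 8> Narrow =
      permuteVar<8>(lowHalf(Src), lowHalf(Idx));
  for (std::size_t I = 0; I != NumDemanded && I != 8; ++I)
    if (Wide[I] != Narrow[I])
      return false;
  return true;
}
```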
---
Full diff: https://github.com/llvm/llvm-project/pull/134890.diff
2 Files Affected:
- (modified) llvm/lib/Target/X86/X86ISelLowering.cpp (+3-1)
- (modified) llvm/test/CodeGen/X86/vector-shuffle-512-v16.ll (+2-2)
``````````diff
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 47ac1ee571269..908b81d896e34 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -43810,7 +43810,9 @@ bool X86TargetLowering::SimplifyDemandedVectorEltsForTargetNode(
case X86ISD::VPERMV: {
SmallVector<int, 16> Mask;
SmallVector<SDValue, 2> Ops;
- if ((VT.is256BitVector() || Subtarget.hasVLX()) &&
+ // We can always split v16i32/v16f32 AVX512 to v8i32/v8f32 AVX2 variants.
+ if ((VT.is256BitVector() || Subtarget.hasVLX() || VT == MVT::v16i32 ||
+ VT == MVT::v16f32) &&
getTargetShuffleMask(Op, /*AllowSentinelZero=*/false, Ops, Mask)) {
// For lane-crossing shuffles, only split in half in case we're still
// referencing higher elements.
diff --git a/llvm/test/CodeGen/X86/vector-shuffle-512-v16.ll b/llvm/test/CodeGen/X86/vector-shuffle-512-v16.ll
index b1efb416014b0..7df80ee9f175b 100644
--- a/llvm/test/CodeGen/X86/vector-shuffle-512-v16.ll
+++ b/llvm/test/CodeGen/X86/vector-shuffle-512-v16.ll
@@ -491,8 +491,8 @@ define <4 x float> @test_v16f32_0_1_3_6 (<16 x float> %v) {
; ALL-LABEL: test_v16f32_0_1_3_6:
; ALL: # %bb.0:
; ALL-NEXT: vpmovsxbd {{.*#+}} xmm1 = [0,1,3,6]
-; ALL-NEXT: vpermps %zmm0, %zmm1, %zmm0
-; ALL-NEXT: # kill: def $xmm0 killed $xmm0 killed $zmm0
+; ALL-NEXT: vpermps %ymm0, %ymm1, %ymm0
+; ALL-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0
; ALL-NEXT: vzeroupper
; ALL-NEXT: retq
%res = shufflevector <16 x float> %v, <16 x float> poison, <4 x i32> <i32 0, i32 1, i32 3, i32 6>
``````````
</details>
https://github.com/llvm/llvm-project/pull/134890