[llvm] 74f69c4 - [X86] SimplifyDemandedVectorEltsForTargetNode - reduce the size of VPERMV v16f32/v16i32 nodes if the upper elements are not demanded (#134890)
via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 9 03:14:42 PDT 2025
Author: Simon Pilgrim
Date: 2025-04-09T11:14:38+01:00
New Revision: 74f69c49fed894ba26b6174783e4c650d50344c5
URL: https://github.com/llvm/llvm-project/commit/74f69c49fed894ba26b6174783e4c650d50344c5
DIFF: https://github.com/llvm/llvm-project/commit/74f69c49fed894ba26b6174783e4c650d50344c5.diff
LOG: [X86] SimplifyDemandedVectorEltsForTargetNode - reduce the size of VPERMV v16f32/v16i32 nodes if the upper elements are not demanded (#134890)
Missed in #133923 - even without AVX512VL, we can replace VPERMV v16f32/v16i32 nodes with the AVX2 v8f32/v8i32 equivalents.
Added:
Modified:
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/vector-shuffle-512-v16.ll
Removed:
################################################################################
diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 47ac1ee571269..908b81d896e34 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -43810,7 +43810,9 @@ bool X86TargetLowering::SimplifyDemandedVectorEltsForTargetNode(
case X86ISD::VPERMV: {
SmallVector<int, 16> Mask;
SmallVector<SDValue, 2> Ops;
- if ((VT.is256BitVector() || Subtarget.hasVLX()) &&
+ // We can always split v16i32/v16f32 AVX512 to v8i32/v8f32 AVX2 variants.
+ if ((VT.is256BitVector() || Subtarget.hasVLX() || VT == MVT::v16i32 ||
+ VT == MVT::v16f32) &&
getTargetShuffleMask(Op, /*AllowSentinelZero=*/false, Ops, Mask)) {
// For lane-crossing shuffles, only split in half in case we're still
// referencing higher elements.
diff --git a/llvm/test/CodeGen/X86/vector-shuffle-512-v16.ll b/llvm/test/CodeGen/X86/vector-shuffle-512-v16.ll
index b1efb416014b0..7df80ee9f175b 100644
--- a/llvm/test/CodeGen/X86/vector-shuffle-512-v16.ll
+++ b/llvm/test/CodeGen/X86/vector-shuffle-512-v16.ll
@@ -491,8 +491,8 @@ define <4 x float> @test_v16f32_0_1_3_6 (<16 x float> %v) {
; ALL-LABEL: test_v16f32_0_1_3_6:
; ALL: # %bb.0:
; ALL-NEXT: vpmovsxbd {{.*#+}} xmm1 = [0,1,3,6]
-; ALL-NEXT: vpermps %zmm0, %zmm1, %zmm0
-; ALL-NEXT: # kill: def $xmm0 killed $xmm0 killed $zmm0
+; ALL-NEXT: vpermps %ymm0, %ymm1, %ymm0
+; ALL-NEXT: # kill: def $xmm0 killed $xmm0 killed $ymm0
; ALL-NEXT: vzeroupper
; ALL-NEXT: retq
%res = shufflevector <16 x float> %v, <16 x float> poison, <4 x i32> <i32 0, i32 1, i32 3, i32 6>
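
Illustrative sketch (not part of the commit): the small C++ model below tries to show why the narrowing in the test above is safe. It assumes that a single-source variable permute computes Result[i] = Src[Mask[i] mod N], and that the combine only fires when every demanded result lane and every index it reads lie in the low half of the vector. The names permuteModel, canNarrowToLowHalf, and demoNarrowingMatches are hypothetical helpers invented for this example; they are not LLVM APIs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical model of a single-source variable permute (not LLVM code):
// each result lane reads the source lane selected by the mask, modulo N.
template <std::size_t N>
std::array<float, N> permuteModel(const std::array<float, N> &Src,
                                  const std::array<int, N> &Mask) {
  std::array<float, N> Res{};
  for (std::size_t I = 0; I < N; ++I)
    Res[I] = Src[static_cast<std::size_t>(Mask[I]) % N];
  return Res;
}

// Assumed legality check: every demanded lane sits in the low 8 lanes and
// its mask index also refers to one of the low 8 source lanes.
bool canNarrowToLowHalf(const std::array<int, 16> &Mask,
                        const std::array<bool, 16> &Demanded) {
  for (std::size_t I = 0; I < 16; ++I)
    if (Demanded[I] && (I >= 8 || Mask[I] < 0 || Mask[I] >= 8))
      return false;
  return true;
}

// Replays the test mask <0,1,3,6>: only the low 4 result lanes are demanded.
bool demoNarrowingMatches() {
  std::array<float, 16> Src{};
  for (std::size_t I = 0; I < 16; ++I)
    Src[I] = 100.0f + static_cast<float>(I);

  std::array<int, 16> Mask{}; // undemanded lanes default to index 0
  Mask[0] = 0; Mask[1] = 1; Mask[2] = 3; Mask[3] = 6;

  std::array<bool, 16> Demanded{};
  for (std::size_t I = 0; I < 4; ++I)
    Demanded[I] = true;

  if (!canNarrowToLowHalf(Mask, Demanded))
    return false;

  // 512-bit style permute over all 16 lanes.
  std::array<float, 16> Wide = permuteModel<16>(Src, Mask);

  // 256-bit style permute over the low halves of source and mask.
  std::array<float, 8> LoSrc{};
  std::array<int, 8> LoMask{};
  for (std::size_t I = 0; I < 8; ++I) {
    LoSrc[I] = Src[I];
    LoMask[I] = Mask[I];
  }
  std::array<float, 8> Narrow = permuteModel<8>(LoSrc, LoMask);

  // The demanded lanes must agree between the wide and narrow forms.
  for (std::size_t I = 0; I < 4; ++I)
    if (Wide[I] != Narrow[I])
      return false;
  return true;
}
```

Under these assumptions the wide and narrow permutes agree on lanes 0 to 3, which matches the test change from a zmm vpermps to a ymm vpermps. A mask that reads lane 8 or above from a demanded lane fails the check in this model and would have to stay at 512 bits.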