[llvm] r365041 - [X86][AVX] combineX86ShuffleChainWithExtract - add number of non-zero extract_subvectors to the combine depth
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 3 07:17:21 PDT 2019
Author: rksimon
Date: Wed Jul 3 07:17:21 2019
New Revision: 365041
URL: http://llvm.org/viewvc/llvm-project?rev=365041&view=rev
Log:
[X86][AVX] combineX86ShuffleChainWithExtract - add number of non-zero extract_subvectors to the combine depth
This better accounts for the cost/benefit of removing extract_subvectors from the shuffle and will be more useful in future patches.
The vpermq predicate regression will be fixed shortly.
Modified:
llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
llvm/trunk/test/CodeGen/X86/avx512-shuffles/partial_permute.ll
Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=365041&r1=365040&r2=365041&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Wed Jul 3 07:17:21 2019
@@ -32429,6 +32429,9 @@ static SDValue combineX86ShuffleChainWit
if (WideInputs.size() > 2)
return SDValue();
+ // Increase depth for every upper subvector we've peeked through.
+ Depth += count_if(Offsets, [](unsigned Offset) { return Offset > 0; });
+
// Attempt to combine wider chain.
// TODO: Can we use a better Root?
SDValue WideRoot = WideInputs[0];
Modified: llvm/trunk/test/CodeGen/X86/avx512-shuffles/partial_permute.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/avx512-shuffles/partial_permute.ll?rev=365041&r1=365040&r2=365041&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/avx512-shuffles/partial_permute.ll (original)
+++ llvm/trunk/test/CodeGen/X86/avx512-shuffles/partial_permute.ll Wed Jul 3 07:17:21 2019
@@ -2216,9 +2216,9 @@ define <2 x i64> @test_masked_8xi64_to_2
define <2 x i64> @test_masked_z_8xi64_to_2xi64_perm_mask0(<8 x i64> %vec, <2 x i64> %mask) {
; CHECK-LABEL: test_masked_z_8xi64_to_2xi64_perm_mask0:
; CHECK: # %bb.0:
+; CHECK-NEXT: vpermq {{.*#+}} zmm0 = zmm0[3,0,2,3,7,4,6,7]
; CHECK-NEXT: vptestnmq %xmm1, %xmm1, %k1
-; CHECK-NEXT: vpermq {{.*#+}} zmm0 {%k1} {z} = zmm0[3,0,2,3,7,4,6,7]
-; CHECK-NEXT: # kill: def $xmm0 killed $xmm0 killed $zmm0
+; CHECK-NEXT: vmovdqa64 %xmm0, %xmm0 {%k1} {z}
; CHECK-NEXT: vzeroupper
; CHECK-NEXT: retq
%shuf = shufflevector <8 x i64> %vec, <8 x i64> undef, <2 x i32> <i32 3, i32 0>
More information about the llvm-commits
mailing list