[llvm] [X86] SimplifyDemandedVectorEltsForTargetNode - reduce the size of VPERMV/VPERMV3 nodes if the upper elements are not demanded (PR #133923)
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 2 09:10:26 PDT 2025
================
@@ -1113,8 +1113,8 @@ define <16 x i8> @evenelts_v32i16_trunc_v16i16_to_v16i8(<32 x i16> %n2) nounwind
;
; AVX512VBMI-FAST-LABEL: evenelts_v32i16_trunc_v16i16_to_v16i8:
; AVX512VBMI-FAST: # %bb.0:
-; AVX512VBMI-FAST-NEXT: vmovdqa {{.*#+}} xmm1 = [0,4,8,12,16,20,24,28,32,36,40,44,48,52,56,79]
-; AVX512VBMI-FAST-NEXT: vpxor %xmm2, %xmm2, %xmm2
+; AVX512VBMI-FAST-NEXT: vmovdqa {{.*#+}} xmm1 = [64,65,66,67,68,69,24,28,32,36,40,44,48,52,56,79]
+; AVX512VBMI-FAST-NEXT: vpmovdb %ymm0, %xmm2
----------------
RKSimon wrote:
This change allows the VPMOVDB node to be created from another VPERMV3 node, so we no longer have 2 VPERMV3 nodes that we can easily fold together. We're still struggling to combine shuffles across different vector widths; that will be handled eventually after #133947, but it is a much larger WIP patch that is highly dependent on getting this one in first.
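
For readers unfamiliar with the narrowing idea in the PR title, here is a minimal sketch in Python of shrinking a two-source permute when only the lower half of the result is demanded. It is a simplified model, not the code from the patch: `permv3` and `narrow_permv3_mask` are hypothetical names, the vectors are plain lists, and the model assumes indices 0..N-1 select from the first source and N..2N-1 from the second.

```python
from typing import List, Optional, Sequence


def permv3(mask: Sequence[int], src0: Sequence[int], src1: Sequence[int]) -> List[int]:
    """Model of a two-source variable permute (hypothetical helper).

    Index m selects from src0 when m < N and from src1 when N <= m < 2N.
    """
    n = len(mask)
    assert len(src0) == n and len(src1) == n
    both = list(src0) + list(src1)
    # Only the low bits of each index matter, modelled here as modulo 2N.
    return [both[m % (2 * n)] for m in mask]


def narrow_permv3_mask(mask: Sequence[int]) -> Optional[List[int]]:
    """Try to rewrite an N-element mask as an N/2-element mask.

    Assumes only the lower N/2 result elements are demanded. Returns None
    when a demanded element reads from the upper half of either source.
    """
    n = len(mask)
    half = n // 2
    narrowed = []
    for m in mask[:half]:
        m %= 2 * n
        if m < half:
            # Lower half of src0: index is unchanged.
            narrowed.append(m)
        elif n <= m < n + half:
            # Lower half of src1: src1 now starts at index N/2, not N.
            narrowed.append(m - half)
        else:
            return None
    return narrowed


src0 = list(range(8))
src1 = list(range(100, 108))
wide_mask = [0, 8, 1, 9, 7, 7, 7, 7]  # upper four results are not demanded

narrow_mask = narrow_permv3_mask(wide_mask)
wide_low = permv3(wide_mask, src0, src1)[:4]
narrow = permv3(narrow_mask, src0[:4], src1[:4])
```

In this model the narrowed permute produces the same lower elements as the wide one, while a mask such as `[6, 0, 0, 0, 0, 0, 0, 0]` cannot be narrowed because its first demanded element reads the upper half of the first source.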
https://github.com/llvm/llvm-project/pull/133923