[llvm] [X86] IsElementEquivalent - add handling for ISD::BITCASTS from smaller vector elements (PR #139741)

Simon Pilgrim via llvm-commits llvm-commits at lists.llvm.org
Mon May 19 00:07:01 PDT 2025


================
@@ -487,10 +487,10 @@ define void @store_i32_stride7_vf4(ptr %in.vecptr0, ptr %in.vecptr1, ptr %in.vec
 ; AVX-NEXT:    vunpckhpd {{.*#+}} xmm11 = xmm2[1],xmm1[1]
 ; AVX-NEXT:    vshufps {{.*#+}} ymm9 = ymm9[2,1],ymm11[2,0],ymm9[6,5],ymm11[6,4]
 ; AVX-NEXT:    vblendps {{.*#+}} ymm9 = ymm10[0,1],ymm9[2,3,4],ymm10[5,6,7]
-; AVX-NEXT:    vinsertf128 $1, %xmm1, %ymm0, %ymm10
+; AVX-NEXT:    vinsertf128 $1, %xmm1, %ymm2, %ymm10
 ; AVX-NEXT:    vunpcklps {{.*#+}} ymm8 = ymm10[0],ymm8[0],ymm10[1],ymm8[1],ymm10[4],ymm8[4],ymm10[5],ymm8[5]
 ; AVX-NEXT:    vbroadcastss (%r10), %ymm10
-; AVX-NEXT:    vblendps {{.*#+}} ymm8 = ymm8[0,1,2,3,4,5],ymm10[6,7]
+; AVX-NEXT:    vunpcklpd {{.*#+}} ymm8 = ymm8[0],ymm10[0],ymm8[2],ymm10[2]
----------------
RKSimon wrote:

On some Intel targets it can be, but the general rule has been to prefer simpler shuffle patterns whenever possible. But I'll see what options we have as we do lose commutation which can be useful for load folding.

https://github.com/llvm/llvm-project/pull/139741


More information about the llvm-commits mailing list