[llvm] [X86] matchUnaryShuffle - add support for matching 512-bit extension patterns. (PR #127643)

Phoebe Wang via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 18 07:06:15 PST 2025


================
@@ -110,8 +109,7 @@ define void @mask_replication_factor2_vf8(ptr %in.maskvec, ptr %in.vec, ptr %out
 ; AVX512DQ:       # %bb.0:
 ; AVX512DQ-NEXT:    kmovb (%rdi), %k0
 ; AVX512DQ-NEXT:    vpmovm2d %k0, %zmm0
-; AVX512DQ-NEXT:    vpmovsxbd {{.*#+}} zmm1 = [0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7]
-; AVX512DQ-NEXT:    vpermd %zmm0, %zmm1, %zmm0
+; AVX512DQ-NEXT:    vpmovsxdq %ymm0, %zmm0
----------------
phoebewang wrote:

I think the ideal instructions would be:
```
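kmovb (%rdi), %k1                     # load the 8-bit mask from %in.maskvec
vmovdqa64 (%rsi), %zmm0 {%k1} {z}     # zero-masked load of 8 x i64 from %in.vec
vmovdqa64 %zmm0, (%rdx)               # store the full 512-bit result to %out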
```
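For a replication factor on a mask, we should just apply the factor to the element width for efficiency: replicating each mask bit twice over 32-bit elements is equivalent to using the original mask over 64-bit elements.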
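
To illustrate why scaling the element width is equivalent, here is a small scalar model in Python. It is only a sketch: the helper names (`replicate_mask`, `masked_load_zero`, `pack_pairs_le`) and the sample data are hypothetical, not anything from the LLVM sources, and it assumes little-endian lane order and zero-masking semantics as in the test above.

```python
def replicate_mask(mask_bits, factor):
    """Replicate each mask bit `factor` times, e.g. [a, b] -> [a, a, b, b]."""
    return [bit for bit in mask_bits for _ in range(factor)]


def masked_load_zero(elements, mask_bits):
    """Model a zero-masking load: keep an element if its bit is set, else 0."""
    return [e if m else 0 for e, m in zip(elements, mask_bits)]


def pack_pairs_le(dwords):
    """View consecutive 32-bit pairs as 64-bit little-endian elements."""
    return [lo | (hi << 32) for lo, hi in zip(dwords[0::2], dwords[1::2])]


# Hypothetical sample data: an 8-bit mask and 16 x i32 of memory.
mask8 = [1, 0, 1, 1, 0, 0, 1, 0]
dwords = [0x1000 + i for i in range(16)]

# Path A: replicate the mask to 16 bits, then zero-mask 16 x i32.
path_a = masked_load_zero(dwords, replicate_mask(mask8, 2))

# Path B: keep the 8-bit mask and zero-mask 8 x i64 instead.
path_b = masked_load_zero(pack_pairs_le(dwords), mask8)

# Both paths produce the same 512 bits.
assert pack_pairs_le(path_a) == path_b
```

Path B needs no shuffle or extension of the mask vector at all, which is the point of using `vmovdqa64` with the original `%k1`.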

https://github.com/llvm/llvm-project/pull/127643

