[llvm] [X86] matchUnaryShuffle - add support for matching 512-bit extension patterns. (PR #127643)
Phoebe Wang via llvm-commits
llvm-commits at lists.llvm.org
Tue Feb 18 07:06:15 PST 2025
================
@@ -110,8 +109,7 @@ define void @mask_replication_factor2_vf8(ptr %in.maskvec, ptr %in.vec, ptr %out
; AVX512DQ: # %bb.0:
; AVX512DQ-NEXT: kmovb (%rdi), %k0
; AVX512DQ-NEXT: vpmovm2d %k0, %zmm0
-; AVX512DQ-NEXT: vpmovsxbd {{.*#+}} zmm1 = [0,0,1,1,2,2,3,3,4,4,5,5,6,6,7,7]
-; AVX512DQ-NEXT: vpermd %zmm0, %zmm1, %zmm0
+; AVX512DQ-NEXT: vpmovsxdq %ymm0, %zmm0
----------------
phoebewang wrote:
I think the ideal instructions would be:
```
kmovb (%rdi), %k1
vmovdqa64 (%rsi), %zmm0 {%k1} {z}
vmovdqa64 %zmm0, (%rdx)
```
For mask replication, we should just scale the element width by the replication factor instead of replicating the mask bits, for efficiency.
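
A minimal sketch of why the two forms are equivalent, assuming the replicated mask is only used for a zero-masked load of 32-bit elements with factor 2. This is a plain Python model, not LLVM code, and the helper names are hypothetical:

```python
from itertools import product

FACTOR = 2  # replication factor, as in mask_replication_factor2_vf8
VF = 8      # number of mask bits loaded by kmovb

def replicated_dword_load(dwords, mask):
    """Replicate each mask bit FACTOR times, then zero-mask 32-bit elements."""
    wide_mask = [bit for bit in mask for _ in range(FACTOR)]
    return [d if m else 0 for d, m in zip(dwords, wide_mask)]

def qword_load(dwords, mask):
    """Keep the original mask and zero-mask FACTOR-times-wider elements."""
    out = []
    for i, bit in enumerate(mask):
        lane = dwords[i * FACTOR:(i + 1) * FACTOR]  # one 64-bit lane as two dwords
        out.extend(lane if bit else [0] * FACTOR)
    return out

def paths_agree():
    """Compare both paths over every possible 8-bit mask."""
    dwords = list(range(1, VF * FACTOR + 1))  # 16 nonzero dwords
    return all(
        replicated_dword_load(dwords, mask) == qword_load(dwords, mask)
        for mask in product([0, 1], repeat=VF)
    )

print(paths_agree())
```

Under those assumptions the 8-bit mask applied at 64-bit granularity selects the same bytes as the 16-bit replicated mask applied at 32-bit granularity, which is what the `vmovdqa64 ... {%k1} {z}` form relies on.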
https://github.com/llvm/llvm-project/pull/127643