[llvm] [VectorCombine] Add Cmp and Select for shuffleToIdentity (PR #92794)
David Green via llvm-commits
llvm-commits at lists.llvm.org
Thu May 30 10:20:23 PDT 2024
davemgreen wrote:
Thanks. With no other optimizations going on, the two versions look like this: https://godbolt.org/z/MfETGqdse
Good:
```
.LBB0_2: # %"2_for_op_pblend_b_0.s0.x.x"
vmovdqa xmm1, xmmword ptr [r14 + rdx - 16]
vmovdqa xmm2, xmmword ptr [r14 + rdx]
vmovdqa xmm3, xmmword ptr [r14 + rdx + 16]
vpmaxub xmm4, xmm1, xmm0
vpcmpeqb xmm4, xmm1, xmm4
vpmaxub xmm5, xmm2, xmm0
vpcmpeqb xmm5, xmm2, xmm5
vpblendvb xmm1, xmm2, xmm1, xmm4
vpblendvb xmm2, xmm3, xmm2, xmm5
vmovdqa xmmword ptr [rcx + rdx + 16], xmm2
vmovdqa xmmword ptr [rcx + rdx], xmm1
add rdx, 32
cmp rdx, 768
jne .LBB0_2
```
Bad:
```
.LBB0_2: # %"2_for_op_pblend_b_0.s0.x.x"
vmovdqa xmm1, xmmword ptr [r14 + rdx - 16]
vmovdqa xmm2, xmmword ptr [r14 + rdx]
vinsertf128 ymm3, ymm1, xmm2, 1
vpmaxub xmm4, xmm2, xmm0
vpcmpeqb xmm4, xmm2, xmm4
vpmaxub xmm5, xmm1, xmm0
vpcmpeqb xmm1, xmm1, xmm5
vinsertf128 ymm1, ymm1, xmm4, 1
vinsertf128 ymm2, ymm2, xmmword ptr [r14 + rdx + 16], 1
vandnps ymm2, ymm1, ymm2
vandps ymm1, ymm3, ymm1
vorps ymm1, ymm1, ymm2
vmovaps ymmword ptr [rcx + rdx], ymm1
add rdx, 32
cmp rdx, 768
jne .LBB0_2
```
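As a rough reading of the two listings: both compute the same per-byte select. The `vpmaxub` + `vpcmpeqb` pair builds an unsigned "greater than or equal" mask, and the result is chosen either with `vpblendvb` (Good) or with an and/andn/or sequence on 256-bit registers after several `vinsertf128` (Bad). The C++ below is only a scalar, per-byte model of that reading, not code from the PR; the function names are hypothetical and introduced purely for illustration.

```cpp
#include <algorithm>
#include <cstdint>

// Per-byte model of vpmaxub + vpcmpeqb: the mask is all-ones when
// a >= threshold (unsigned), because max(a, threshold) == a exactly then.
// (Hypothetical helper name, for illustration only.)
inline std::uint8_t cmp_ge_mask(std::uint8_t a, std::uint8_t threshold) {
    return std::max(a, threshold) == a ? 0xFF : 0x00;
}

// Per-byte model of vpblendvb: the top bit of the mask byte picks the operand.
inline std::uint8_t select_blend(std::uint8_t mask, std::uint8_t on_true,
                                 std::uint8_t on_false) {
    return (mask & 0x80) ? on_true : on_false;
}

// Per-byte model of the vandps / vandnps / vorps sequence:
// (on_true & mask) | (on_false & ~mask).
inline std::uint8_t select_bitwise(std::uint8_t mask, std::uint8_t on_true,
                                   std::uint8_t on_false) {
    return static_cast<std::uint8_t>((on_true & mask) | (on_false & ~mask));
}
```

Since the compare only ever produces all-ones or all-zeros bytes, the two selects agree on every input the loop can see; the difference between the listings is in the instruction count and the extra 128-bit to 256-bit inserts, not in the result.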
https://github.com/llvm/llvm-project/pull/92794