[llvm] [VectorCombine] Add Cmp and Select for shuffleToIdentity (PR #92794)

David Green via llvm-commits llvm-commits at lists.llvm.org
Thu May 30 10:20:23 PDT 2024


davemgreen wrote:

Thanks. With no other optimizations going on, the two versions look like this: https://godbolt.org/z/MfETGqdse
Good:
```
.LBB0_2:                                # %"2_for_op_pblend_b_0.s0.x.x"
        vmovdqa xmm1, xmmword ptr [r14 + rdx - 16]
        vmovdqa xmm2, xmmword ptr [r14 + rdx]
        vmovdqa xmm3, xmmword ptr [r14 + rdx + 16]
        vpmaxub xmm4, xmm1, xmm0
        vpcmpeqb        xmm4, xmm1, xmm4
        vpmaxub xmm5, xmm2, xmm0
        vpcmpeqb        xmm5, xmm2, xmm5
        vpblendvb       xmm1, xmm2, xmm1, xmm4
        vpblendvb       xmm2, xmm3, xmm2, xmm5
        vmovdqa xmmword ptr [rcx + rdx + 16], xmm2
        vmovdqa xmmword ptr [rcx + rdx], xmm1
        add     rdx, 32
        cmp     rdx, 768
        jne     .LBB0_2
```

Bad:
```
.LBB0_2:                                # %"2_for_op_pblend_b_0.s0.x.x"
        vmovdqa xmm1, xmmword ptr [r14 + rdx - 16]
        vmovdqa xmm2, xmmword ptr [r14 + rdx]
        vinsertf128     ymm3, ymm1, xmm2, 1
        vpmaxub xmm4, xmm2, xmm0
        vpcmpeqb        xmm4, xmm2, xmm4
        vpmaxub xmm5, xmm1, xmm0
        vpcmpeqb        xmm1, xmm1, xmm5
        vinsertf128     ymm1, ymm1, xmm4, 1
        vinsertf128     ymm2, ymm2, xmmword ptr [r14 + rdx + 16], 1
        vandnps ymm2, ymm1, ymm2
        vandps  ymm1, ymm3, ymm1
        vorps   ymm1, ymm1, ymm2
        vmovaps ymmword ptr [rcx + rdx], ymm1
        add     rdx, 32
        cmp     rdx, 768
        jne     .LBB0_2
```
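
For reference, both loops compute the same per-byte select. The good version does it with `vpblendvb`, while the bad version widens to ymm and spells it as and/andn/or. Below is a minimal scalar sketch of one byte lane in Python. The function names are hypothetical, and it assumes `xmm0` holds a loop-invariant value to compare against (its contents are not shown in the snippets above).

```python
# Scalar model of one byte lane from the loops above.
# All names here are illustrative; they do not come from LLVM or the PR.

def ge_mask(x: int, threshold: int) -> int:
    """Unsigned x >= threshold, built the way vpmaxub + vpcmpeqb build it:
    max(x, threshold) == x yields an all-ones byte, otherwise zero."""
    return 0xFF if max(x, threshold) == x else 0x00

def select_blend(mask: int, a: int, b: int) -> int:
    """vpblendvb-style select: the top bit of the mask byte picks a over b."""
    return a if mask & 0x80 else b

def select_bitwise(mask: int, a: int, b: int) -> int:
    """vandps / vandnps / vorps-style select: (mask & a) | (~mask & b)."""
    return (mask & a) | (~mask & 0xFF & b)

def lane(x: int, nxt: int, threshold: int, select) -> int:
    """One output byte: keep x where x >= threshold, else take the next byte."""
    return select(ge_mask(x, threshold), x, nxt)
```

The two selects agree because the compare only ever produces an all-ones or all-zeros byte. The difference between the good and bad loops is the instruction sequence used, not the value computed.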

https://github.com/llvm/llvm-project/pull/92794

