[llvm] [VPlan] Simplify VPInstruction::Selects and'ed with header mask with EVL (PR #147243)

Luke Lau via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 7 03:23:35 PDT 2025


lukel97 wrote:

cc @artagnon for context: one case this fixes, which we're seeing in SPEC CPU 2017, is the following from 525.x264_r:

```c
#include <stdint.h>

#define QUANT_ONE( coef, mf, f ) \
{ \
    if( (coef) > 0 ) \
        (coef) = (f + (coef)) * (mf) >> 16; \
    else \
        (coef) = - ((f - (coef)) * (mf) >> 16); \
    nz |= (coef); \
}


int quant_4x4( int16_t dct[16], uint16_t mf[16], uint16_t bias[16] )
{
    int nz = 0;
    for( int i = 0; i < 16; i++ )
        QUANT_ONE( dct[i], mf[i], bias[i] );
    return !!nz;
}
```

Without this patch, we see a redundant header mask kept around and `vmand.mm`'d into some other 'useful' masks:
```asm
.LBB0_3:                                # %vector.body
	# ...
	vadd.vx	v16, v8, a3 # HEADER MASK
	vmsleu.vi	v23, v16, 15
	# ...
	vsetvli	zero, zero, e16, m1, ta, ma
	vmsgt.vi	v24, v21, 0 # OTHER MASK
	# ...
	vmand.mm	v0, v23, v24 # REDUNDANT AND BECAUSE EVL ALREADY SET
	vnsrl.wi	v16, v14, 16, v0.t
	# ...
```

With this patch it gets removed:

```asm
.LBB0_3:                                # %vector.body
	# ...
	vmsgt.vi	v0, v16, 0 # OTHER MASK USED DIRECTLY
	vnsrl.wi	v12, v10, 16, v0.t
```
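
To illustrate why the AND is redundant once EVL is set, here is a rough scalar model (not code from the patch). The function name `count_mask_mismatches` and its parameters are hypothetical, and it assumes EVL tail folding means only the first `min(VF, remaining)` lanes are processed in each iteration. Under that assumption every processed lane already has its header mask bit set, so `header & other` is the same as `other`:

```c
#include <assert.h>
#include <stdbool.h>

/* Counts the lanes where (header_mask & other_mask) differs from other_mask
 * alone. All names here are hypothetical, for illustration only. */
static int count_mask_mismatches(int vf, int trip_count, bool use_evl)
{
    assert(vf > 0 && trip_count >= 0);
    int mismatches = 0;
    for (int iv = 0; iv < trip_count; iv += vf) {
        int remaining = trip_count - iv;
        /* Assumption: with EVL only the first `remaining` lanes are
         * processed in the tail iteration; otherwise all vf lanes are. */
        int active = (use_evl && remaining < vf) ? remaining : vf;
        for (int lane = 0; lane < active; lane++) {
            /* Header mask bit: is this lane still inside the trip count? */
            bool header = iv + lane < trip_count;
            /* Try both possible values of the other mask bit. */
            for (int other = 0; other <= 1; other++)
                if ((header && other) != (other != 0))
                    mismatches++;
        }
    }
    return mismatches;
}
```

In this model, `count_mask_mismatches(8, 13, true)` finds no lane where the header mask changes the result, whereas with `use_evl` set to `false` the three tail lanes past the trip count do differ, which is the case where the header mask is still needed.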

https://github.com/llvm/llvm-project/pull/147243

