[llvm] [VPlan] Simplify VPInstruction::Selects and'ed with header mask with EVL (PR #147243)
Luke Lau via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 7 03:23:35 PDT 2025
lukel97 wrote:
cc @artagnon, just for context: one case this fixes, which we're seeing in SPEC CPU 2017, is this loop from 525.x264_r:
```c
#include <stdint.h>
#define QUANT_ONE( coef, mf, f ) \
{ \
    if( (coef) > 0 ) \
        (coef) = (f + (coef)) * (mf) >> 16; \
    else \
        (coef) = - ((f - (coef)) * (mf) >> 16); \
    nz |= (coef); \
}
int quant_4x4( int16_t dct[16], uint16_t mf[16], uint16_t bias[16] )
{
    int nz = 0;
    for( int i = 0; i < 16; i++ )
        QUANT_ONE( dct[i], mf[i], bias[i] );
    return !!nz;
}
```
Without this patch, the header mask is kept around and `vmand.mm`'d into other 'useful' masks, even though it is redundant once the EVL is set:
```asm
.LBB0_3: # %vector.body
// ...
vadd.vx v16, v8, a3 # HEADER MASK
vmsleu.vi v23, v16, 15
// ...
vsetvli zero, zero, e16, m1, ta, ma
vmsgt.vi v24, v21, 0 # OTHER MASK
// ...
vmand.mm v0, v23, v24 # REDUNDANT AND BECAUSE EVL ALREADY SET
vnsrl.wi v16, v14, 16, v0.t
// ...
```
With this patch, the header mask is removed and the other mask is used directly:
```asm
.LBB0_3: # %vector.body
// ...
vmsgt.vi v0, v16, 0 # OTHER MASK USED DIRECTLY
vnsrl.wi v12, v10, 16, v0.t
```
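To spell out why the `vmand.mm` is redundant, here is a rough scalar model in C. It is only a sketch under assumptions: `VF`, `header_mask`, `narrowing_shift_evl`, `header_mask_is_redundant` and `self_check` are hypothetical names for illustration, not anything from the patch, and the model treats lanes at or beyond the EVL as simply not written (with a tail-agnostic policy real hardware leaves them unspecified, but they are not used either way). Since EVL = min(VF, trip count - iv), every lane below the EVL is already in bounds, so the header mask is all-true on the lanes that matter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { VF = 8 }; /* hypothetical number of lanes per vector iteration */

/* Header mask for the vector iteration starting at scalar index iv:
   a lane is in bounds iff iv + lane < trip_count. */
static void header_mask(unsigned iv, unsigned trip_count, bool out[VF])
{
    for (unsigned lane = 0; lane < VF; lane++)
        out[lane] = iv + lane < trip_count;
}

/* Toy model of a masked, EVL-predicated narrowing shift: only lanes below
   evl whose mask bit is set are written; all other lanes keep their value. */
static void narrowing_shift_evl(const uint32_t src[VF], const bool mask[VF],
                                unsigned evl, uint16_t dst[VF])
{
    for (unsigned lane = 0; lane < evl; lane++)
        if (mask[lane])
            dst[lane] = (uint16_t)(src[lane] >> 16);
}

/* Returns true if and-ing the header mask into another mask never changes
   the result of the EVL-predicated operation, for every vector iteration. */
static bool header_mask_is_redundant(unsigned trip_count)
{
    for (unsigned iv = 0; iv < trip_count; ) {
        unsigned remaining = trip_count - iv;
        unsigned evl = remaining < VF ? remaining : VF;
        bool header[VF], other[VF], both[VF];
        uint32_t src[VF];
        uint16_t with_and[VF], without_and[VF];

        header_mask(iv, trip_count, header);
        for (unsigned lane = 0; lane < VF; lane++) {
            src[lane] = (iv + lane) * 0x9E3779B1u;   /* arbitrary data */
            other[lane] = (src[lane] >> 7) & 1;       /* arbitrary "useful" mask */
            both[lane] = header[lane] && other[lane]; /* models the vmand.mm */
        }
        memset(with_and, 0xFF, sizeof with_and);
        memset(without_and, 0xFF, sizeof without_and);
        narrowing_shift_evl(src, both, evl, with_and);
        narrowing_shift_evl(src, other, evl, without_and);
        if (memcmp(with_and, without_and, sizeof with_and) != 0)
            return false;
        iv += evl; /* EVL tail folding advances by the EVL */
    }
    return true;
}

/* Checks a few trip counts, including the 16 from quant_4x4. */
void self_check(void)
{
    assert(header_mask_is_redundant(16));
    assert(header_mask_is_redundant(5));
    assert(header_mask_is_redundant(19));
}
```

Calling `self_check()` exercises a trip count that is a multiple of `VF`, one smaller than `VF`, and one that leaves a partial final iteration; in each case the result with and without the header mask is the same.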
https://github.com/llvm/llvm-project/pull/147243