[llvm] [RISCV] Combine (mul (zext, zext)) -> (zext (mul (zext, zext))) (PR #86465)
Luke Lau via llvm-commits
llvm-commits at lists.llvm.org
Tue Mar 26 08:27:24 PDT 2024
lukel97 wrote:
A reduced example of the interference with vwmacc looks like this:
```diff
- vsetvli zero, zero, e32, m1, ta, ma
- vzext.vf4 v17, v11
- vzext.vf4 v16, v18
- vwmaccu.vv v14, v16, v17
+ vsetvli zero, zero, e8, mf4, ta, ma
+ vwmulu.vv v17, v11, v18
+ vsetvli zero, zero, e32, m1, ta, ma
+ vzext.vf2 v16, v17
+ vwaddu.wv v14, v14, v16
```
So although we lose the vwmacc, we trade a vzext for a vwmulu.vv at a smaller EMUL, so at larger LMULs this may still be cheaper.
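As a sanity check on why doing the multiply at the narrow width is sound, here is a minimal Python sketch (not part of the patch; the helper names are made up for illustration). It assumes 8-bit unsigned inputs and models the vwmulu.vv result as a 16-bit value that is then zero-extended, and compares that against multiplying the already zero-extended values at 64 bits.

```python
U16_MASK = (1 << 16) - 1
U64_MASK = (1 << 64) - 1


def mul_wide(a: int, b: int) -> int:
    """Model (mul (zext a), (zext b)) evaluated directly at 64 bits."""
    return (a * b) & U64_MASK


def mul_narrow_then_zext(a: int, b: int) -> int:
    """Model zext(vwmulu(a, b)): the product wraps at 16 bits, then is zero-extended."""
    # Zero-extension does not change the value of a non-negative Python int.
    return (a * b) & U16_MASK


def all_u8_pairs_agree() -> bool:
    """Exhaustively compare both forms over every pair of 8-bit unsigned inputs."""
    return all(
        mul_wide(a, b) == mul_narrow_then_zext(a, b)
        for a in range(256)
        for b in range(256)
    )
```

The two forms agree because the largest possible product, 255 * 255 = 65025, still fits in 16 bits.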
It's worth noting that the widening combines today also affect our ability to select vwmacc, e.g.:
```llvm
define <vscale x 4 x i64> @f(<vscale x 4 x i8> %a, <vscale x 4 x i8> %b, <vscale x 4 x i8> %c) {
%a.zext = zext <vscale x 4 x i8> %a to <vscale x 4 x i64>
%b.zext = zext <vscale x 4 x i8> %b to <vscale x 4 x i64>
%c.zext = zext <vscale x 4 x i8> %c to <vscale x 4 x i64>
%mul = mul <vscale x 4 x i64> %a.zext, %b.zext
%macc = add <vscale x 4 x i64> %mul, %c.zext
ret <vscale x 4 x i64> %macc
}
```
With combineBinOp_VLToVWBinOp_VL we get:
```asm
f:
vsetvli a0, zero, e32, m2, ta, ma
vzext.vf4 v12, v8
vzext.vf4 v14, v9
vzext.vf4 v16, v10
vwmulu.vv v8, v12, v14
vwaddu.wv v8, v8, v16
```
Without combineBinOp_VLToVWBinOp_VL we get:
```asm
f:
vsetvli a0, zero, e64, m4, ta, ma
vzext.vf8 v16, v8
vzext.vf8 v20, v9
vzext.vf8 v12, v10
vmacc.vv v12, v16, v20
vmv.v.v v8, v12
```
So this is probably worth looking into as a separate issue. I think we should be able to do this as:
```asm
f:
vsetvli a0, zero, e16, m1, ta, ma
vzext.vf2 v12, v10
vsetvli a0, zero, e8, mf2, ta, ma
vwmaccu.vv v12, v8, v9
vsetvli a0, zero, e64, m4, ta, ma
vzext.vf4 v8, v12
```
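The idea behind the sequence above, accumulating with vwmaccu.vv at a narrow element width and zero-extending only once at the end, relies on a*b + c not overflowing the intermediate width. A minimal Python sketch of that check follows, assuming 8-bit unsigned inputs and a 16-bit intermediate accumulator (the function names are hypothetical and not from the patch). It checks every (a, b) pair with the largest addend c = 255, which is the worst case since the sum only grows with c.

```python
U8_MAX = 0xFF
U16_MASK = 0xFFFF


def macc_wide(a: int, b: int, c: int) -> int:
    """Model the original IR: all three inputs zero-extended, then mul and add at 64 bits."""
    return a * b + c


def macc_narrow(a: int, b: int, c: int) -> int:
    """Model the narrow sequence: zext c to 16 bits, widening multiply-accumulate
    of the 8-bit a and b into that 16-bit accumulator, then zero-extend."""
    acc = c & U16_MASK
    acc = (acc + a * b) & U16_MASK  # a 16-bit accumulator wraps on overflow
    return acc  # zero-extension leaves the value unchanged


def narrow_macc_is_exact() -> bool:
    """Compare both forms for every (a, b) pair at the worst-case addend c = 255."""
    return all(
        macc_wide(a, b, U8_MAX) == macc_narrow(a, b, U8_MAX)
        for a in range(256)
        for b in range(256)
    )
```

The accumulator never wraps because the maximum value is 255 * 255 + 255 = 65280, which is below 65536.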
https://github.com/llvm/llvm-project/pull/86465