[PATCH] D145488: [CodeGen][AArch64] Generate Pseudo instructions for integer MLA/MAD/MLS/MSB

Paul Walker via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 8 05:59:30 PST 2023


paulwalker-arm added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:407-408
+                            [(add node:$op1, (AArch64mul_p_oneuse node:$pred, node:$op2, node:$op3)),
+                             // add(a, select(mask, mul(b, c), splat(0))) -> mla(a, mask, b, c)
+                             (add node:$op1, (vselect node:$pred, (AArch64mul_p_oneuse (SVEAllActive), node:$op2, node:$op3), (SVEDup0)))]>;
 def AArch64mls_m1 : PatFrags<(ops node:$pred, node:$op1, node:$op2, node:$op3),
----------------
sushgokh wrote:
> paulwalker-arm wrote:
> > The `add(a, select(...` pattern has specific requirements for the result of inactive lanes that matches `AArch64mla_m1`.  Did you mean to move it into `AArch64mla_p`?
> The current semantics of add(a, select(...) )  is resulting in 
> mla za, p0/m, zn * zm
> 
> The patch, even after moving this pattern to AArch64mla_p, should generate the pseudo with same predicate semantics (i.e. p0/m), right? I assume that thats the meaning of FalseLanesUndef. Correct me if wrong.
> 
> If this understanding is incorrect,
> 1. is there any other way to specify multiple patterns with different inactive lane requirements to a single pseudo?
> 
> 2. One option is define pattern that explicitly maps add(a, select(...) ) to specific pseudo.
> 
> 3. Other option is let this pattern remain with AArch64mla_m1 rather than AArch64mla_p
> The current semantics of add(a, select(...) )  is resulting in 
> mla za, p0/m, zn * zm

The output is only valid because of how the operation has been register allocated.  Remember the point of this patch is that `MLA_ZPZZZ` can emit an `MLA` or `MAD` based on what the register allocator chooses to do.  You can see the bug the current implementation will introduce by running the following through llc:
````
define <vscale x 16 x i8> @good(<vscale x 16 x i8> %a, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c, <vscale x 16 x i1> %mask) {
  %mul = mul nsw <vscale x 16 x i8> %b, %c
  %sel = select <vscale x 16 x i1> %mask, <vscale x 16 x i8> %mul, <vscale x 16 x i8> zeroinitializer
  %add = add <vscale x 16 x i8> %a, %sel
  ret <vscale x 16 x i8> %add
}

define <vscale x 16 x i8> @bad(<vscale x 16 x i8> %a, <vscale x 16 x i8> %b, <vscale x 16 x i8> %c, <vscale x 16 x i1> %mask) {
  %mul = mul nsw <vscale x 16 x i8> %a, %b
  %sel = select <vscale x 16 x i1> %mask, <vscale x 16 x i8> %mul, <vscale x 16 x i8> zeroinitializer
  %add = add <vscale x 16 x i8> %c, %sel
  ret <vscale x 16 x i8> %add
}

good:
	mla	z0.b, p0/m, z1.b, z2.b
	ret

bad:
	mad	z0.b, p0/m, z1.b, z2.b
	ret
```
Both functions expect the inactive lanes to return the original value of the addend.  However the second example will incorrectly set the result of inactive lanes to the original value of the multiplicand.  Essential both sets of IR are synonymous to the AArch64mla_m1 operation.

> The patch, even after moving this pattern to AArch64mla_p, should generate the pseudo with same predicate semantics (i.e. p0/m), right? I assume that thats the meaning of FalseLanesUndef. Correct me if wrong.

`FalseLanesUndef` means the inactive lanes have no defined value but as explained above the `add(a, select(...) )` pattern sets the inactive lanes to a defined value, namely `added + 0`.

> 
> If this understanding is incorrect,
> 1. is there any other way to specify multiple patterns with different inactive lane requirements to a single pseudo?
> 
> 2. One option is define pattern that explicitly maps add(a, select(...) ) to specific pseudo.
> 
> 3. Other option is let this pattern remain with AArch64mla_m1 rather than AArch64mla_p

Based on the above I think option 3 is the only correct answer.



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D145488/new/

https://reviews.llvm.org/D145488



More information about the llvm-commits mailing list