[PATCH] D147619: [SVE] Add patterns to delete redundant sel instructions

Thu Apr 6 23:18:33 PDT 2023

lizhijin added a comment.

The aim is to combine svadd_x and svsel intrinsics.
As we can see gcc has patterns to combine sel and some other instructions, while clang can't.
gcc : https://godbolt.org/z/G4aYefG51
clang : https://godbolt.org/z/G4aYefG51

Firstly I want to implement it by DAG Combiner for most instructions (sel + svmul , sel + fmla and etc) , but I find some instructions don't have DAG opcode, like fmulx, fsubr, etc. Do you mean to implement it in the mid-end inst combine or DAG-Combine or machine-combiner ?

In D147619#4248541 <https://reviews.llvm.org/D147619#4248541>, @paulwalker-arm wrote:

> Hi @lizhijin, I've added comments relating to the correctness of the patch but I'm a little concerned about the relevance of the patch's intent.
>
> What real world use case are you targeting?  I ask because on first glance it seems like you care about instances where users simulate `svadd_m` like builtins via a combination of `svadd_x` and `svsel` builtins.  If this is the case then I think a cleaner implementation would be to handle this as an instcombine.  However, if you're using the `_u` intrinsics purely as a proxy, then I'm wondering what the real IR looks like.
>
> If you look at `AArch64fadd_m1` you'll see we already have some support for the style of merging operations we expect to get during auto-vectorisation and there's sure to be benefit in extending this to cover more of the binops as your patch does but the patterns themselves are slightly different (i.e. the binop typically takes an all active predicate).
>
> None of this necessary blocks this patch but I'm worried about unnecessarily polluting isel.

In D147619#4248541 <https://reviews.llvm.org/D147619#4248541>, @paulwalker-arm wrote:

> Hi @lizhijin, I've added comments relating to the correctness of the patch but I'm a little concerned about the relevance of the patch's intent.
>
> What real world use case are you targeting?  I ask because on first glance it seems like you care about instances where users simulate `svadd_m` like builtins via a combination of `svadd_x` and `svsel` builtins.  If this is the case then I think a cleaner implementation would be to handle this as an instcombine.  However, if you're using the `_u` intrinsics purely as a proxy, then I'm wondering what the real IR looks like.
>
> If you look at `AArch64fadd_m1` you'll see we already have some support for the style of merging operations we expect to get during auto-vectorisation and there's sure to be benefit in extending this to cover more of the binops as your patch does but the patterns themselves are slightly different (i.e. the binop typically takes an all active predicate).
>
> None of this necessary blocks this patch but I'm worried about unnecessarily polluting isel.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147619/new/

https://reviews.llvm.org/D147619