[PATCH] D159267: [AArch64] Remove copy instruction between uaddlv and dup

Fri Sep 1 02:53:37 PDT 2023

jaykang10 added a comment.

In D159267#4631879 <https://reviews.llvm.org/D159267#4631879>, @efriedma wrote:

> Ideally, we'd lower the intrinsic to some operation that returns its result in a vector register.  Given limitations of SelectionDAG, that means introducing an opcode that produces a <2 x i32> or something like that.  So instead of "(AArch64dup (int_aarch64_neon_uaddlv))", we'd end up with something more like "(AArch64dup (extract_element (AArch64uaddlv)))", and existing patterns would naturally do the right thing.
>
> Otherwise, I think we end up needing way too many patterns to cover every operation that could possibly use the result of a uaddlv in a vector register.

Thanks for your kind comment.
Even if we add a custom SDNode with a vector result type for uaddlv, we would still need a copy instruction between the two register classes, because the AArch64dup here is the scalar form, which takes a scalar input. We would have to change the scalar dup to the vector dup as well as changing uaddlv. That is why I added the pattern...
I am not sure how we can generalize changing uaddlv and the instruction that uses it to their vector forms...
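
To illustrate the register-class point, here is a toy Python model. It is not LLVM code: the Node class, count_copies, and the node names are all hypothetical, and the only assumption it encodes is that uaddlv writes a SIMD/FP register, that the scalar dup form reads a general-purpose register, and that the lane dup form reads a vector register.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a selected instruction (not an LLVM type)."""
    name: str
    result_class: str  # register class of the result: "GPR" or "FPR"
    operand_classes: List[str] = field(default_factory=list)  # class each operand must be in
    operands: List["Node"] = field(default_factory=list)


def count_copies(node: Node) -> int:
    """Count the cross-register-class copies needed to feed every operand."""
    copies = 0
    for expected, operand in zip(node.operand_classes, node.operands):
        if operand.result_class != expected:
            copies += 1  # e.g. an fmov from an FPR to a GPR
        copies += count_copies(operand)
    return copies


# The vector being summed already lives in a SIMD/FP register.
vec_input = Node("v0", "FPR")

# Assumption: uaddlv leaves its sum in a SIMD/FP register.
uaddlv = Node("uaddlv", "FPR", ["FPR"], [vec_input])

# Scalar dup form: its input must be in a general-purpose register.
scalar_dup = Node("dup_from_gpr", "FPR", ["GPR"], [uaddlv])

# Lane dup form: its input is lane 0 of a vector register.
lane_dup = Node("dup_from_lane", "FPR", ["FPR"], [uaddlv])

print(count_copies(scalar_dup))  # 1
print(count_copies(lane_dup))    # 0
```

In this model the copy disappears only when the user of uaddlv is also switched to a form that reads a vector register, which is the concern described above.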

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159267/new/

https://reviews.llvm.org/D159267