[PATCH] D99324: [AArch64][SVE] Simplify codegen of svdup_lane intrinsic

Thu Mar 25 04:16:52 PDT 2021

paulwalker-arm added a comment.

In D99324#2650100 <https://reviews.llvm.org/D99324#2650100>, @junparser wrote:

> In D99324#2650064 <https://reviews.llvm.org/D99324#2650064>, @paulwalker-arm wrote:
>
>> I'm not saying all the pieces will come for free but this feels like an intrinsic optimisation problem rather than an instruction selection one.  What about extending SVEIntrinsicOpts.cpp to convert the pattern to a stock `splat_vector(extract_vector_elt(vec, idx))` and then letting the code generator decide how best to lower the LLVM way of doing things.  This'll mean we solve the problem once for ACLE and auto-vectorisation.
>
> Actually, it is an isel issue. The svdup_lane in the title is just where I found this issue.
> 1) There is no intrinsic that maps directly to the dup (index) instruction; vector_extract may lower to dup (index), but that is not enough. 2) The svdup_lane ACLE intrinsic is generated as sve.dup.x + sve.tbl in LLVM IR, which is converted to AArch64tbl ( ... splat_vector(..., constant)) and then lowered to AArch64tbl ( ... DUP(..., imm)). This is the pattern this patch tries to match.

Sure, I understand that.  But generating good code to duplicate a vector lane seems like a generic problem, so we can solve that first.  Then we can canonicalise ACLE-related intrinsic patterns to stock LLVM IR and so avoid needing multiple solutions to the same problem.  In the future this will also allow other stock LLVM transforms to kick in that would otherwise not understand the SVE-specific intrinsics.
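
To illustrate the two forms being compared, here is a minimal scalar sketch in C++. It is not LLVM code: `Vec`, `tbl`, `dup_lane_via_tbl` and `dup_lane_via_splat` are hypothetical names, and a `std::vector` stands in for one SVE register. It only models that `tbl(vec, dup(idx))` and `splat_vector(extract_vector_elt(vec, idx))` agree when the index is in range, assuming TBL returns zero for out-of-range indices.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scalar model only: a std::vector stands in for one SVE register.
using Vec = std::vector<int32_t>;

// Models a table lookup: lane i of the result is data[indices[i]],
// or 0 when the index is out of range (assumed TBL behaviour).
Vec tbl(const Vec &data, const std::vector<uint32_t> &indices) {
  Vec out(indices.size());
  for (std::size_t i = 0; i < indices.size(); ++i)
    out[i] = indices[i] < data.size() ? data[indices[i]] : 0;
  return out;
}

// ACLE-style form: broadcast the index to every lane, then look it up.
Vec dup_lane_via_tbl(const Vec &data, uint32_t idx) {
  return tbl(data, std::vector<uint32_t>(data.size(), idx));
}

// Stock form: extract one element, then splat it across all lanes.
// The model requires an in-range index, since an out-of-range extract
// has no defined value to splat.
Vec dup_lane_via_splat(const Vec &data, uint32_t idx) {
  assert(idx < data.size() && "index must be in range for the stock form");
  return Vec(data.size(), data[idx]);
}
```

The out-of-range case is where the two forms differ in this model, so any canonicalisation along these lines would presumably need the index to be known in range, or would need to handle that case separately.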

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D99324