[PATCH] D99324: [AArch64][SVE] Simplify codegen of svdup_lane intrinsic
JunMa via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 25 04:10:25 PDT 2021
junparser added a comment.
In D99324#2650064 <https://reviews.llvm.org/D99324#2650064>, @paulwalker-arm wrote:
> I'm not saying all the pieces will come for free but this feels like an intrinsic optimisation problem rather than an instruction selection one. What about extending SVEIntrinsicOpts.cpp to convert the pattern to a stock `splat_vector(extract_vector_elt(vec, idx))` and then letting the code generator decide how best to lower the LLVM way of doing things. This'll mean we solve the problem once for ACLE and auto-vectorisation.
Actually, it is an isel issue. The svdup_lane in the title is just where I found it.
1) There is no intrinsic that maps directly to the dup (index) instruction. Although vector_extract may be lowered with dup (index), that is not enough.
2) The svdup_lane ACLE intrinsic is generated as sve.dup.x + sve.tbl in LLVM IR, which is converted to AArch64tbl(..., splat_vector(..., constant)) and then lowered to AArch64tbl(..., DUP(..., imm)). This is the pattern this patch tries to match.
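To illustrate why a table lookup whose index vector is a constant splat can be replaced by a single indexed dup, here is a minimal scalar sketch in C++. It is not LLVM code. The names `Vec`, `splat`, `tbl` and `dup_lane` are hypothetical helpers made up for this illustration, and the zero-on-out-of-range behaviour is modelled as an assumption about the instructions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar stand-in for a vector register of 32-bit lanes.
using Vec = std::vector<std::uint32_t>;

// Model of a splat: every lane holds the same value.
Vec splat(std::size_t lanes, std::uint32_t value) {
    return Vec(lanes, value);
}

// Model of a table lookup: lane i of the result is data[indices[i]].
// Assumption: an out-of-range index yields zero.
Vec tbl(const Vec& data, const Vec& indices) {
    Vec result(indices.size(), 0);
    for (std::size_t i = 0; i < indices.size(); ++i) {
        if (indices[i] < data.size()) {
            result[i] = data[indices[i]];
        }
    }
    return result;
}

// Model of an indexed dup: broadcast one lane of data to all lanes.
// Assumption: an out-of-range index yields an all-zero vector.
Vec dup_lane(const Vec& data, std::uint32_t index) {
    const std::uint32_t value = index < data.size() ? data[index] : 0;
    return Vec(data.size(), value);
}
```

In this model, `tbl(v, splat(v.size(), k))` and `dup_lane(v, k)` produce the same vector for any constant `k`, which is the equivalence the pattern match relies on.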
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D99324/new/
https://reviews.llvm.org/D99324