[PATCH] D105119: [SVE] Fix incorrect codegen when inserting vector elements into widened scalable vectors
Eli Friedman via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 12 12:51:17 PDT 2021
efriedma added a comment.
In D105119#2871256 <https://reviews.llvm.org/D105119#2871256>, @sdesmalen wrote:
> Hi @efriedma, I had to give this a moment to sink in, but you are right that the widening does not have to assume anything about the target's vector format; I had my wires crossed here.
>
> The widening operation done here is performed purely on a conceptual vector. Widening <vscale x 1 x i64> to <vscale x 2 x i64> by appending undef elements at the end of the vector is actually the operation performed by INSERT_SUBVECTOR (and vice versa for EXTRACT_SUBVECTOR), which we then implement efficiently with UZP1 and UUNPKLO if the (min) element count is a power of two.
Currently, all the target-independent code assumes FMT2. This includes the lowering for INSERT_ELEMENT, INSERT_SUBVECTOR, EXTRACT_SUBVECTOR, etc. I also don't think we have any target-specific code that triggers for `<vscale x 1 x i64>`. (Due to a bug/quirk in the type legalization code, we don't end up using our code in AArch64TargetLowering::LowerINSERT_SUBVECTOR. I'll look at this.)
As you mention, it's pretty efficient to flip between the formats if we need to.
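
For illustration, the conceptual widening described in the quoted comment (append undef elements at the end, i.e. an INSERT_SUBVECTOR at index 0, with EXTRACT_SUBVECTOR as the inverse) could be modelled roughly as follows. This is only a sketch on a "conceptual vector": it assumes vscale is known at the point of the call, uses std::nullopt to stand in for undef, and says nothing about how the lanes are laid out in an SVE register. The names Lane, ScalableVec, widenByAppendingUndef and extractLow are hypothetical and are not LLVM APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model of <vscale x N x i64> for one concrete vscale:
// a flat list of lanes, where std::nullopt stands in for an undef lane.
using Lane = std::optional<std::int64_t>;
using ScalableVec = std::vector<Lane>;

// Conceptual widening: copy the narrow lanes to the front of the wide vector
// and leave every trailing lane undef (INSERT_SUBVECTOR at index 0).
ScalableVec widenByAppendingUndef(const ScalableVec &narrow,
                                  unsigned narrowMinElts,
                                  unsigned wideMinElts) {
  assert(narrowMinElts != 0 && narrow.size() % narrowMinElts == 0);
  assert(wideMinElts >= narrowMinElts);
  const std::size_t vscale = narrow.size() / narrowMinElts;
  ScalableVec wide(vscale * wideMinElts, std::nullopt);
  std::copy(narrow.begin(), narrow.end(), wide.begin());
  return wide;
}

// Inverse operation: keep only the leading lanes (EXTRACT_SUBVECTOR at index 0).
ScalableVec extractLow(const ScalableVec &wide,
                       unsigned wideMinElts,
                       unsigned narrowMinElts) {
  assert(wideMinElts != 0 && wide.size() % wideMinElts == 0);
  assert(narrowMinElts <= wideMinElts);
  const std::size_t vscale = wide.size() / wideMinElts;
  return ScalableVec(wide.begin(), wide.begin() + vscale * narrowMinElts);
}
```

In this model, element i of the narrow vector stays element i of the widened vector, so inserting or extracting an element by index needs no knowledge of the target's register format.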
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D105119/new/
https://reviews.llvm.org/D105119