[PATCH] D105119: [SVE] Fix incorrect codegen when inserting vector elements into widened scalable vectors

Mon Jul 12 12:51:17 PDT 2021

efriedma added a comment.

In D105119#2871256 <https://reviews.llvm.org/D105119#2871256>, @sdesmalen wrote:

> Hi @efriedma, I had to give this a moment to sink in, but you are right that the widening doesn't have to assume anything about the target's vector format; I had my wires crossed here.
>
> The widening operation done here is performed purely on a conceptual vector. Widening <vscale x 1 x i64> to <vscale x 2 x i64> by appending undef elements at the end of the vector is actually the operation that's performed by INSERT_SUBVECTOR (and vice versa for EXTRACT_SUBVECTOR), which we then implement efficiently with UZP1 and UUNPKLO if the (min) element count is a power of two.

Currently, all of the target-independent code assumes FMT2. This includes the lowering for INSERT_ELEMENT, INSERT_SUBVECTOR, EXTRACT_SUBVECTOR, etc. And I don't think we have any target-specific code that triggers for `<vscale x 1 x i64>`. (Due to a bug/quirk in the type legalization code, we don't end up using our code in AArch64TargetLowering::LowerINSERT_SUBVECTOR. I'll look into this.)

Like you mention, it's pretty efficient to flip between the formats if we need to.
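
As a rough illustration of the "conceptual vector" widening described in the quote above, here is a small standalone C++ model. It is not LLVM code: `Lane`, `ConceptualVector`, `widenByAppendingUndef`, and `insertElement` are hypothetical names made up for this sketch, `std::nullopt` stands in for undef, and vscale is pinned to a concrete runtime value. It only shows that, under the append-undef-at-the-end view, inserting an element and widening commute because element indices are unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical model: one lane of a scalable vector, where std::nullopt
// stands in for an undef element.
using Lane = std::optional<int64_t>;
// A scalable vector flattened at a fixed, known vscale (assumption).
using ConceptualVector = std::vector<Lane>;

// Widen <vscale x MinElts x i64> to <vscale x WideMinElts x i64> by
// appending undef lanes at the end. Existing lanes keep their indices.
ConceptualVector widenByAppendingUndef(const ConceptualVector &V,
                                       std::size_t MinElts,
                                       std::size_t WideMinElts,
                                       std::size_t VScale) {
  assert(WideMinElts >= MinElts && "widening must not shrink the vector");
  assert(V.size() == MinElts * VScale && "input size must match its type");
  ConceptualVector Wide(V);
  Wide.resize(WideMinElts * VScale, std::nullopt);
  return Wide;
}

// Insert a scalar at a conceptual element index and return the new vector.
ConceptualVector insertElement(ConceptualVector V, std::size_t Idx,
                               int64_t Val) {
  assert(Idx < V.size() && "index out of range");
  V[Idx] = Val;
  return V;
}
```

With vscale = 4, a `<vscale x 1 x i64>` value has 4 lanes and its widened form has 8; inserting at index 2 before or after widening gives the same widened vector in this model.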

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105119/new/

https://reviews.llvm.org/D105119