[PATCH] D105119: [SVE] Fix incorrect codegen when inserting vector elements into widened scalable vectors

Thu Jul 1 13:35:05 PDT 2021

efriedma added a comment.

> For <vscale x 4 x i16> (assuming a native vector width of vscale x 128 bits for this example):

There are two independent questions to consider: one, what layout we're using, and two, how we achieve that layout.

For `<vscale x 4 x i16>`, we use TypePromoteInteger in legalization, to convert it to `<vscale x 4 x i32>`. TypePromoteInteger has the same meaning whether or not a vector is scalable.
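To make the promotion case concrete, here is a small illustrative model (not LLVM code; `promote_integer` and its fields are hypothetical names). It assumes the native vector width of vscale x 128 bits from the quoted example, and only shows that promotion keeps the element count and the element-to-lane mapping unchanged for any vscale.

```python
def promote_integer(elts_per_vscale, src_bits, dst_bits, vscale):
    """Toy model of promoting <vscale x N x iSRC> to <vscale x N x iDST>.

    Hypothetical helper for illustration only; it is not an LLVM API.
    """
    count = elts_per_vscale * vscale
    return {
        "elements": count,                # element count is unchanged
        "bits_before": count * src_bits,  # total size of the illegal type
        "bits_after": count * dst_bits,   # total size of the promoted type
        "lane_of": list(range(count)),    # element i stays in lane i
    }


# <vscale x 4 x i16> -> <vscale x 4 x i32>, modelled at vscale = 2
result = promote_integer(elts_per_vscale=4, src_bits=16, dst_bits=32, vscale=2)
```

In this model the promoted type fills exactly vscale x 128 bits, and no element moves relative to the others, which is why the scalable and fixed-width cases behave the same way.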

Say we have `<vscale x 1 x i64>`.  You're saying we need to ensure this has a layout where we alternate between legal and illegal elements.  Possible ways to achieve that layout:

1. Make `<vscale x 1 x i64>` a legal type, and use some combination of operation legalization/isel patterns to ensure the alternation.
2. Use TypePromoteInteger to `<vscale x 1 x i128>`.  But then we need to figure out how to legalize `<vscale x 1 x i128>`, and we'd need a different solution for floating-point types.
3. Define TypeWidenVector to append undef elements to the end, but convert between the two layouts where the difference is externally visible (calling convention code).
4. Define TypeWidenVector to do some sort of interleaving.  I guess when widening `<vscale x N x i64>` -> `<vscale x M x i64>`, we insert M-N undef elements after every N elements?
5. Come up with a new LegalizeTypeAction that represents what we want to do here.
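The difference between the layouts in (3) and (4) can be sketched as follows. This is a toy model, not LLVM code: `widen_append`, `widen_interleave`, and `UNDEF` are hypothetical names, and the interleaving rule is my reading of (4), i.e. M-N undef elements after every N real elements.

```python
UNDEF = None  # stand-in for an undef element


def widen_append(elts, n, m):
    """Option (3): all real elements first, undef padding at the end."""
    assert len(elts) % n == 0
    vscale = len(elts) // n
    return list(elts) + [UNDEF] * (vscale * (m - n))


def widen_interleave(elts, n, m):
    """Option (4): M-N undef elements after every group of N real elements."""
    assert len(elts) % n == 0
    out = []
    for start in range(0, len(elts), n):
        out.extend(elts[start:start + n])
        out.extend([UNDEF] * (m - n))
    return out


# <vscale x 1 x i64> widened to <vscale x 2 x i64>, modelled at vscale = 2
appended = widen_append(["e0", "e1"], n=1, m=2)         # e0, e1, undef, undef
interleaved = widen_interleave(["e0", "e1"], n=1, m=2)  # e0, undef, e1, undef
```

At vscale = 1 the two functions produce the same result, so the layouts only diverge for larger vscale. That is where the choice becomes externally visible.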

This patch is proposing (4), I think?  Have we made any other changes that imply (4)?  Have the other possibilities been discussed anywhere?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105119/new/

https://reviews.llvm.org/D105119