[PATCH] D80139: [CodeGen][SVE] CopyToReg: Split scalable EVTs that are not powers of 2
Eli Friedman via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon May 18 16:17:27 PDT 2020
efriedma added inline comments.
================
Comment at: llvm/lib/CodeGen/TargetLoweringBase.cpp:955
+ if (VT.isScalableVector() && !isPowerOf2_32(EC.Min))
+ llvm_unreachable(
+ "Splitting or widening of non-power-of-2 MVTs is not implemented.");
----------------
sdesmalen wrote:
> The reason for not implementing this yet is because I don't have a way to test it.
I think the only way to trigger this is by adding more types to llvm/include/llvm/Support/MachineValueType.h; it's fine to ignore that for now.
================
Comment at: llvm/lib/CodeGen/TargetLoweringBase.cpp:1425
+ // Scalable vectors cannot be scalarized, so splitting is required.
+ // FIXME: Implement widening of scalable vectors.
+ if (VT.isScalableVector() && !isPowerOf2_32(EltCnt.Min)) {
----------------
sdesmalen wrote:
> note: Widening of scalable vectors requires a bit more thought.
>
> For example, I wasn't sure how a `<vscale x 3 x i8>` should be widened at this level. Is it widened to a `<vscale x 4 x i8>` and then promoted to a `<vscale x 4 x i32>`, or do we keep it simple and always insert it into an UNDEF of the first legal type (`<vscale x 16 x i8>` for SVE).
>
> It would also require changes to legalise e.g. `<vscale x 9 x i16>`, that needs both widening to `<vscale x 16 x i16>`, and then splitting to two `<vscale x 8 x i16>`.
>
> The case handled by this patch allows it to support the tuple types for the structured load/store instructions, such as svint32x3_t (implemented as `<vscale x 12 x i32>`).
>
For widening values that fit in a single register, I think we should do whatever getTypeToTransformTo() would do. If you look a few lines up, there's a call to getTypeAction(); that code is supposed to handle cases where the type would fit into a single vector register. It probably makes sense to enhance it to be a bit more comprehensive in cases where there's more than one legalization step involved.
If we have a value with a legal element type that has more elements than the legal vector type, I guess we have to widen, then split: so `<vscale x 9 x i16>` would end up using `<vscale x 8 x i16>` components.
================
Comment at: llvm/lib/CodeGen/TargetLoweringBase.cpp:1437
+ NumIntermediates = VT.getVectorElementCount().Min /
+ SubPartVT.getVectorElementCount().Min;
+ IntermediateVT = SubPartVT;
----------------
Is flooring division correct here?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D80139/new/
https://reviews.llvm.org/D80139
More information about the llvm-commits
mailing list