[PATCH] D80139: [CodeGen][SVE] CopyToReg: Split scalable EVTs that are not powers of 2

Eli Friedman via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon May 18 16:17:27 PDT 2020


efriedma added inline comments.


================
Comment at: llvm/lib/CodeGen/TargetLoweringBase.cpp:955
+  if (VT.isScalableVector() && !isPowerOf2_32(EC.Min))
+    llvm_unreachable(
+        "Splitting or widening of non-power-of-2 MVTs is not implemented.");
----------------
sdesmalen wrote:
> The reason for not implementing this yet is because I don't have a way to test it.
I think the only way to trigger this is by adding more types to llvm/include/llvm/Support/MachineValueType.h; it's fine to ignore that for now.


================
Comment at: llvm/lib/CodeGen/TargetLoweringBase.cpp:1425
+  // Scalable vectors cannot be scalarized, so splitting is required.
+  // FIXME: Implement widening of scalable vectors.
+  if (VT.isScalableVector() && !isPowerOf2_32(EltCnt.Min)) {
----------------
sdesmalen wrote:
> note: Widening of scalable vectors requires a bit more thought.
> 
> For example, I wasn't sure how a `<vscale x 3 x i8>` should be widened at this level. Is it widened to a `<vscale x 4 x i8>` and then promoted to a `<vscale x 4 x i32>`, or do we keep it simple and always insert it into an UNDEF of the first legal type (`<vscale x 16 x i8>` for SVE).
> 
> It would also require changes to legalise e.g. `<vscale x 9 x i16>`, that needs both widening to `<vscale x 16 x i16>`, and then splitting to two `<vscale x 8 x i16>`.
> 
> The case handled by this patch allows it to support the tuple types for the structured load/store instructions, such as svint32x3_t (implemented as `<vscale x 12 x i32>`).
> 
For widening values that fit in a single register, I think we should do whatever getTypeToTransformTo() would do.  If you look a few lines up, there's a call to getTypeAction(); that code is supposed to handle cases where the type would fit into a single vector register.  It probably makes sense to enhance it to be a bit more comprehensive in cases where there's more than one legalization step involved.

If we have a value with a legal element type that has more elements than the legal vector type, I guess we have to widen, then split: so `<vscale x 9 x i16>` would end up using `<vscale x 8 x i16>` components.


================
Comment at: llvm/lib/CodeGen/TargetLoweringBase.cpp:1437
+      NumIntermediates = VT.getVectorElementCount().Min /
+                         SubPartVT.getVectorElementCount().Min;
+      IntermediateVT = SubPartVT;
----------------
Is flooring division correct here?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80139/new/

https://reviews.llvm.org/D80139





More information about the llvm-commits mailing list