[PATCH] D136153: [AArch64] Allow cost computation for interleaved accesses

Wed Nov 16 08:44:49 PST 2022

paulwalker-arm added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:13761-13762
+  unsigned ElSize = DL.getTypeSizeInBits(VecTy->getElementType());
+  unsigned VecSize = (UseScalable || EC.isScalable())
+                         ? Subtarget->getMinSVEVectorSizeInBits()
+                         : 128;
----------------
The usage of `getMinSVEVectorSizeInBits` here is only relevant for fixed length vector types.  When working with scalable types we already know the size of the legal types (i.e. vscale * 128) and so the `: 128` part should be fine because `Scalable_EC/(vscale * 128)` ==> `EC.getKnownMinValue() / 128`).

Fixing this should mean you no longer require `vscale_range` attributes for your scalable vector tests.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:13782-13783
+
+  if (EC.isScalable())
+    return true;

----------------
Is this correct? Do you really want to say all scalable vector types are legal?

================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2415-2416
+          Factor * TLI->getNumInterleavedAccesses(SubVecTy, DL, UseScalable);
+      if (isa<ScalableVectorType>(VecTy))
+        return InstructionCost::getInvalid();
+      return Cost;
----------------
This is worth a comment because the naive optimisation is to simply move the check higher up whereas you deliberately want to exercise as much of the costing code as possible, even though we'll currently ignore the result.

================
Comment at: llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp:2420-2422
+  // Calling the base implementation should only happen
+  // when we can scalarize the operation, what is not possible for
+  // scalable vectors.
----------------
Perhaps simplify to just "All other uses of scalable vectors are not legal."? With that said, looking at the base implementation of getInterleavedMemoryOpCost suggests it does the right thing so you could remove this code and just fall into the base version as before.

================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll:1-6
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -mtriple=aarch64-none-linux-gnu -S -loop-vectorize -instcombine -force-vector-width=4 -force-vector-interleave=1 -enable-interleaved-mem-accesses=true -mattr=+sve -scalable-vectorization=on -runtime-memory-check-threshold=24 < %s | FileCheck %s
+
+; At the moment LLVM is not capable to vectorize interleaved accesses operating
+; on scalable vectors. This test is checking if the LV decides to use
+; gather/satter for such cases.
----------------
Does the code affect the output of these tests? Perhaps in this instance it is better to commit the tests first in isolation and then it's clear to see what effect this patch has (or rather, to show this patch does not change the existing behaviour).

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136153/new/

https://reviews.llvm.org/D136153