[PATCH] D76078: [AArch64][SVE] Add a pass for SVE intrinsic optimisations

Kerry McLaughlin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 20 09:11:07 PDT 2020


kmclaughlin added a comment.

Thanks for reviewing this, @efriedma & @andwar!



================
Comment at: llvm/lib/Target/AArch64/AArch64TargetMachine.cpp:441
+  // Expand any SVE vector library calls that we can't code generate directly.
+  bool ExpandToOptimize = (TM->getOptLevel() != CodeGenOpt::None);
+  if (EnableSVEIntrinsicOpts && TM->getOptLevel() == CodeGenOpt::Aggressive)
----------------
efriedma wrote:
> unused bool?
Removed


================
Comment at: llvm/lib/Target/AArch64/SVEIntrinsicOpts.cpp:120
+    if (!Reinterpret ||
+        RequiredType != Reinterpret->getArgOperand(0)->getType())
+      return false;
----------------
andwar wrote:
> Isn't it guaranteed that `RequiredType == Reinterpret->getArgOperand(0)->getType()` is always true? I.e., `PN` and the incoming values have identical type.
The incoming values to `PN` will all have the same type, but this is making sure that the reinterprets are all converting from the same type (there is a test for this in sve-intrinsic-opts-reinterpret.ll called `reinterpret_reductions_1`, where the arguments to convert.to.svbool are a mix of nxv2i1 and nxv4i1)


================
Comment at: llvm/lib/Target/AArch64/SVEIntrinsicOpts.cpp:224
+  bool Changed = false;
+  for (auto II = BB->begin(), IE = BB->end(); II != IE;) {
+    Instruction *I = &(*II);
----------------
andwar wrote:
> 1. Could this be a for-range loop instead?
> 
> 2. This loop seems to be a perfect candidate for `make_early_inc_range` (https://github.com/llvm/llvm-project/blob/172f1460ae05ab5c33c757142c8bdb10acfbdbe1/llvm/include/llvm/ADT/STLExtras.h#L499), e.g.
> ```
>  for (Instruction &I : make_early_inc_range(BB))
>   Changed |= optimizeIntrinsic(&I);
> ```
Changed this to use `make_early_inc_range` as suggested


================
Comment at: llvm/lib/Target/AArch64/SVEIntrinsicOpts.cpp:234
+  DT = &getAnalysis<DominatorTreeWrapperPass>().getDomTree();
+  bool Changed = false;
+
----------------
efriedma wrote:
> You might want to check whether the module actually declares any of the SVE intrinsics before you iterate over the whole function.
Thanks for the suggestion - I changed this to a module pass so that we can check if any of the SVE intrinsics we are interested in are declared first.


================
Comment at: llvm/test/CodeGen/AArch64/sve-intrinsic-opts-ptest.ll:18
+; OPT-LABEL: ptest_any2
+; OPT: %out = call i1 @llvm.aarch64.sve.ptest.any.nxv16i1(<vscale x 16 x i1> %1, <vscale x 16 x i1> %2)
+  %mask = tail call <vscale x 2 x i1> @llvm.aarch64.sve.ptrue.nxv2i1(i32 31)
----------------
andwar wrote:
> What's `%1` and `%2`? Is it worth adding the calls that generated them in the expected output?
I think that would make sense. I've added `%1` and `%2` to the expected output and added more checks to the other tests here.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76078/new/

https://reviews.llvm.org/D76078





More information about the llvm-commits mailing list