[PATCH] D158724: [AArch64][LoopVectorize] Add truncated store values to list of types for widening
David Sherwood via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 24 05:07:41 PDT 2023
david-arm added a comment.
Thanks @Rin - this looks like a useful improvement! In truncate-type-widening.ll we're clearly now choosing a more natural VF that we can generate code for more efficiently, in particular the step vector intrinsic call in the preheader.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5727
// Examine the stored values.
- if (auto *ST = dyn_cast<StoreInst>(&I))
+ if (auto *ST = dyn_cast<StoreInst>(&I)) {
T = ST->getValueOperand()->getType();
----------------
I wonder if we should also be asking if the store operand is loop invariant too? This would avoid tests changing such as lvm/test/Transforms/LoopVectorize/vplan-stress-test-no-explict-vf.ll. If the input is loop invariant then it's not really participating in the vector loop.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5729
T = ST->getValueOperand()->getType();
+ if (isa<TruncInst>(ST->getOperand(0))) {
+ auto *castTrunc = dyn_cast<CastInst>(ST->getOperand(0));
----------------
I think you can write this more simply as
```if (auto *Trunc = dyn_cast<TruncInst>(ST->getOperand(0)))
T = Trunc->getSrcTy();
```
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:5730
+ if (isa<TruncInst>(ST->getOperand(0))) {
+ auto *castTrunc = dyn_cast<CastInst>(ST->getOperand(0));
+ T = castTrunc->getSrcTy();
----------------
nit: Normally variables in LLVM start with a capital, i.e. `CastTrunc`
================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/truncate-type-widening.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 2
+; RUN: opt -S < %s -passes=loop-vectorize -force-vector-interleave=1 -mtriple aarch64-linux-gnu -mattr=+sve -sve-tail-folding=disabled 2>&1 | FileCheck %s
+
----------------
It looks like we're still using tail-folding despite passing in `-sve-tail-folding=disabled`, which I think is because the vectoriser knows the trip count is low. Perhaps you can just remove the flag?
================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/truncate-type-widening.ll:96
+
+for.cond.cleanup.loopexit: ; preds = %for.body
+ br label %for.cond.cleanup
----------------
I think you can delete this block and just let everything jump directly to `%for.cond.cleanup`
================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/truncate-type-widening.ll:102
+
+for.body: ; preds = %for.body.preheader, %for.body
+ %indvars.iv = phi i64 [ 0, %for.body.preheader ], [ %indvars.iv.next, %for.body ]
----------------
Could you rewrite the blocks in a more natural order, i.e. entry, for.body.preheader, for.body, for.cond.cleanup?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D158724/new/
https://reviews.llvm.org/D158724
More information about the llvm-commits
mailing list