[PATCH] D95363: [SVE][LoopVectorize] Add support for scalable vectorization of loops with vector reverse
Caroline via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 5 01:36:27 PST 2021
CarolineConcatto marked an inline comment as done.
CarolineConcatto added a comment.
Thank you all for the review.
I C&P the function getRunTime from D95139 <https://reviews.llvm.org/D95139>.
If D95139 <https://reviews.llvm.org/D95139> is merged before this patch I will remove it.
All the other dependencies are already in main, and getRunTimeVF is the only function that this patch needs from D95139 <https://reviews.llvm.org/D95139>.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:2807
+ PartPtr->setIsInBounds(InBounds);
+ } else {
+ PartPtr = cast<GetElementPtrInst>(
----------------
david-arm wrote:
> bin.cheng-ali wrote:
> > Sorry for one nitpicking.
> > IIUC the difference for scalable/fixed cases is the two operands of GEPs, is it better to factor out common code by doing below?
> > if (VF.isScalable()) {
> > // build GEP operands for scalable case
> > }
> > else {
> > // build GEP operands for fixed case
> > }
> > // common code building VecPtr with above GEP operands.
> I guess alternatively we could commonise them into one block where we always calculate
>
> ``` Value *LastLane =
> Builder.CreateSub(Builder.getInt32(1),
> getRuntimeVF(Builder, Builder.getInt32Ty(), VF));```
>
> for both fixed-width and scalable vectors, and just let instcombine and other passes fold the constants? I'm not sure if that breaks lots of tests though ...
So @david-arm 's suggestion works for fixed width and scalable vectors.
At least no test has failed.
I could also have done the way @bin.cheng-ali suggested:
```
if (scalable){
NumElts = Builder.CreateMul(Builder.getInt32(1),
getRuntimeVF(Builder, Builder.getInt32Ty(), VF));
LastLane = Builder.CreateSub(Builder.getInt32(1),
getRuntimeVF(Builder, Builder.getInt32Ty(), VF));
}else{
NumElts = 1 * VF.getKnownMinValue()
LastLane = 1 - VF.getKnownMinValue()
}
```
But as all work with getRunTime I thought it was code duplication and this way the code looks simpler.
I added a comment to explain how it works if it is fixed-width vector for me in the future too
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:2818
if (isMaskRequired) // Reverse of a null all-one mask is a null mask.
BlockInMaskParts[Part] = reverseVector(BlockInMaskParts[Part]);
} else {
----------------
david-arm wrote:
> Hi @CarolineConcatto, I just realised there isn't a test for this for scalable vectors. Would you be able to add a test for blocks with masks too? I think it should be something like:
>
> ```void foo(int * __restrict__ a, int * __restrict__ cond, long long N) {
> for (int i = N - 1; i >= 0; i--)
> {
> if (cond[i])
> a[i] += 1;
> }
> }```
>
Thank you @david-arm for pointing that.
Yes, indeed it was missing test for that.
The cost model for i1 was missing with vector reverse.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D95363/new/
https://reviews.llvm.org/D95363
More information about the llvm-commits
mailing list