[PATCH] D20789: Consecutive memory access in Loop Vectorizer - fixed and simplified
Elena Demikhovsky via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 7 23:41:53 PDT 2016
delena added inline comments.
================
Comment at: ../lib/Analysis/LoopAccessAnalysis.cpp:913-917
@@ -861,1 +912,7 @@
+ if (!AR)
+ // One more attempt to get stride from GEP with index calculated in runtime
+ if (auto *Gep = dyn_cast<GetElementPtrInst>(Ptr))
+ if (int Stride = getConsecutiveAccessFromGEPIndex(PSE, Gep, Lp, StridesMap))
+ return Stride;
+
----------------
anemet wrote:
> sbaranga wrote:
> > anemet wrote:
> > > delena wrote:
> > > > anemet wrote:
> > > > > delena wrote:
> > > > > > anemet wrote:
> > > > > > > Sorry for jumping in late but I was on vacation. Elena, does your testcase actually exercise this part of the code or it's handled as a regular AddRec above? I am assuming not since the Ptr is not a GEP.
> > > > > > >
> > > > > > > Do we actually have coverage exercising this part?
> > > > > > >
> > > > > > The test that exercises this code is in version-mem-access.ll. This is the existing test that fails without this extension.
> > > > > > My test case does not have GEP (but it has AddRec ) and this case wan't handled properly.
> > > > > OK and what is the SCEV that fails to become an AddRec after replaceSymbolicStrideSCEV?
> > > > >
> > > > > I am wondering if there is a fundamental reason we need getConsecutiveAccessFromGEPIndex. My guess is that it's this in getConsecutiveAccessFromGEPIndex:
> > > > >
> > > > > // Because of the multiplication by a stride we can have a s/zext cast.
> > > > > // We are going to replace this stride by 1 so the cast is safe to ignore.
> > > > >
> > > > > which does not seem fundamental.
> > > > About this specific case with GEP where we need to call getConsecutiveAccessFromGEPIndex():
> > > >
> > > > GEP itself is AddExpr, not AddRecExpr as expected. AddExpr is (base + MulExpr).
> > > > The MulExpr is (AddRec_of_induction * sext(Stride)).
> > > > We have Stride is equal to 1, but it is not compile time constant, but runtime constant.
> > > > something like this:
> > > > if (Stride ==1) then
> > > > execute vector-loop
> > > > else
> > > > execute scalar-loop
> > > >
> > > > After call to replaceSymbolicStrideSCEV it is converted to sext(AddRecExpr_with_step_1).
> > > >
> > > >
> > > >
> > > The sext is actually around the AddRec:
> > >
> > > (lldb) p PtrScev->dump()
> > > ((8 * (sext i32 {0,+,1}<nuw><%for.body> to i64)) + %x)
> > >
> > > @sbaranga, why can't predicated SCEV turn this into i64 {%x,+,8}?
> > @anemet: It should. If it doesn't I would consider it a bug. What is the input where this is triggered? Is it the added test case?
> >
> > @delena: the expression for that GEP should get folded into an AddRecExpr by SCEV.
> >
> > I'll have a look at this tomorrow. We should avoid versioning on the stride if possible, since that removes a lot of the input space.
> @sbaranga, no. I applied the patch and then ran the old testcase of Transforms/LoopVectorize/version-mem-access.ll.
>
@sbaranga, this GEP may be folded to AddRecExpr by PSE.
I can call to getPtrStride() with Assume=true and PSE will be invoked, but in this case 2 ARM tests fail:
LLVM :: Transforms/LoopVectorize/AArch64/gather-cost.ll
LLVM :: Transforms/LoopVectorize/ARM/gather-cost.ll
Repository:
rL LLVM
http://reviews.llvm.org/D20789
More information about the llvm-commits
mailing list