[PATCH] D29473: [AMDGPU] Unroll preferences improvements
Stanislav Mekhanoshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 3 10:17:05 PST 2017
rampitec marked 4 inline comments as done.
rampitec added inline comments.
================
Comment at: llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:69
+ continue;
+ if (any_of(L->getSubLoops(), [Inst](const Loop* SubLoop) {
+ return SubLoop->contains(Inst); }))
----------------
vpykhtin wrote:
> Why is this check necessary? Is this an early exit when GEP is dependent on more than 1 induction variable?
This just checks that the real dependency is in the inner loop, which is the one that actually needs to be unrolled. When you iterate over the blocks of a loop, you also get the blocks belonging to its inner loops. I'm just making sure we are about to unroll the right one.
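To illustrate the point about block iteration, here is a minimal standalone sketch. It does not use the LLVM API: `ToyLoop`, `belongsDirectlyTo`, and the integer instruction ids are hypothetical stand-ins for `Loop`, the check in the patch, and `Instruction*`. It only models the behavior that `contains` on an outer loop is also true for instructions in nested loops, so the subloops have to be excluded explicitly.

```cpp
#include <algorithm>
#include <vector>

// Toy stand-in for llvm::Loop (hypothetical, for illustration only).
struct ToyLoop {
  std::vector<int> OwnInsts;      // instruction ids directly in this loop
  std::vector<ToyLoop> SubLoops;  // immediately nested loops

  // Modeled on Loop::contains: true for instructions anywhere in the nest.
  bool contains(int Inst) const {
    if (std::find(OwnInsts.begin(), OwnInsts.end(), Inst) != OwnInsts.end())
      return true;
    return std::any_of(SubLoops.begin(), SubLoops.end(),
                       [Inst](const ToyLoop &Sub) { return Sub.contains(Inst); });
  }
};

// True only when Inst is in L itself and not in any of its subloops.
// This is the same shape as the any_of(...) check quoted above.
bool belongsDirectlyTo(const ToyLoop &L, int Inst) {
  if (!L.contains(Inst))
    return false;
  return std::none_of(L.SubLoops.begin(), L.SubLoops.end(),
                      [Inst](const ToyLoop &Sub) { return Sub.contains(Inst); });
}
```

With an outer loop holding instructions {1, 2} and one inner loop holding {10, 11}, instruction 10 is contained in the outer loop but belongs directly only to the inner one, so only the inner loop would be considered for the raised threshold.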
================
Comment at: llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp:88
// programs way too big.
- UP.Threshold = 800;
+ UP.Threshold = UnrollThresholdPrivate;
+ return;
----------------
tstellarAMD wrote:
> vpykhtin wrote:
> > tstellarAMD wrote:
> > > Do you also want to set PartialThreshold here?
> > I thought partially unrolled loops won't make it possible to SROA private arrays. What are the benefits of partial unrolling on AMDGPU btw? What comes to mind: mem ops clustering/widening, fewer branches? What else?
> I had a test case where bumping the PartialThreshold helped more non-partial loops be unrolled, but I looked at the case again and increasing the normal Threshold has the same effect, so I don't think this is needed.
Actually, I agree: partial unrolling does not help SROA an array. There may be other motivations for it, but not this one.
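A hand-written sketch of why that is, in plain C++ rather than IR. The function names and the four-element array are assumptions made up for this example, and a real optimizer may well do better than this source suggests. After a full unroll every array index is a constant, so each element can become its own scalar. After a partial unroll by 2 the indices still depend on the induction variable, so the array has to stay an array.

```cpp
// Rolled loop: Priv is indexed by a runtime value.
int sumRolled(int X) {
  int Priv[4];
  for (int I = 0; I < 4; ++I)
    Priv[I] = X + I;
  int Sum = 0;
  for (int I = 0; I < 4; ++I)
    Sum += Priv[I];
  return Sum;
}

// Full unroll, written by hand: all indices are constants, so each
// element is replaced by a scalar, which is what SROA is after.
int sumFullyUnrolled(int X) {
  int P0 = X + 0, P1 = X + 1, P2 = X + 2, P3 = X + 3;
  return P0 + P1 + P2 + P3;
}

// Partial unroll by 2, written by hand: the indices I and I + 1 still
// vary across iterations, so Priv cannot be split into scalars here.
int sumPartiallyUnrolled(int X) {
  int Priv[4];
  for (int I = 0; I < 4; I += 2) {
    Priv[I] = X + I;
    Priv[I + 1] = X + I + 1;
  }
  int Sum = 0;
  for (int I = 0; I < 4; I += 2) {
    Sum += Priv[I];
    Sum += Priv[I + 1];
  }
  return Sum;
}
```

All three variants compute the same value; they differ only in whether the private array survives.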
Repository:
rL LLVM
https://reviews.llvm.org/D29473