[PATCH] Loop Rerolling Pass

Hal Finkel hfinkel at anl.gov
Wed Oct 16 16:20:06 PDT 2013


----- Original Message -----
> 
> On Oct 16, 2013, at 3:37 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> > ----- Original Message -----
> >> 
> >> On Oct 16, 2013, at 1:14 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> >> 
> >>> ----- Original Message -----
> >>>> 
> >>>> 
> >>>> 
> >>>> On Oct 16, 2013, at 9:18 AM, Hal Finkel < hfinkel at anl.gov >
> >>>> wrote:
> >>>> 
> >>>> 
> >>>> 
> >>>> ----- Original Message -----
> >>>> 
> >>>> 
> >>>> 
> >>>> On 15 October 2013 22:11, Hal Finkel < hfinkel at anl.gov > wrote:
> >>>> 
> >>>> 
> >>>> 
> >>>> 
> >>>> 
> >>>> 
> >>>> I made use of SCEV everywhere that I could (I think). SCEV is used to
> >>>> analyze the induction variables, and then at the end to help with the
> >>>> rewriting. I don't think that I can use SCEV everywhere, however. For
> >>>> one thing, I need to check for equivalence of instructions, with
> >>>> certain substitutions, for instructions (like function calls) that
> >>>> SCEV does not interpret.
> >>>> 
> >>>> 
> >>>> Hi Hal,
> >>>> 
> >>>> 
> >>>> This is probably more my lack of understanding of all that SCEV does
> >>>> than anything else.
> >>>> 
> >>>> 
> >>>> My comment was about the fact that you seem to be investigating
> >>>> specific cases (multiply, adding, increment size near line 240), which
> >>>> SCEV could give you as an expression, possibly making them slightly
> >>>> easier to work with. I'll let other SCEV/LV experts chime in, because
> >>>> I basically don't know what I'm talking about here. ;)
> >>>> 
> >>>> Okay, I see what you mean. The code in this block:
> >>>>   if (Inc == 1) {
> >>>>     // This is a special case: here we're looking for all uses (except
> >>>>     // for the increment) to be multiplied by a common factor. The
> >>>>     // increment must be by one.
> >>>>     if (I->first->getNumUses() != 2)
> >>>>       continue;
> >>>> 
> >>>> This code does not use SCEV because, IMHO, there is no need. It is
> >>>> looking for a very particular instruction pattern where the induction
> >>>> variable has only two uses: one which increments it by one (SCEV has
> >>>> already been used to determine that the increment is 1), and the other
> >>>> is a multiply by a small constant. It is to catch cases like this:
> >>>> 
> >>>> for (int i = 0; i < 500; ++i) {
> >>>>   foo(3*i);
> >>>>   foo(3*i+1);
> >>>>   foo(3*i+2);
> >>>> }
> >>>> 
> >>>> And so, aside from the increment, all uses of the IV are via the
> >>>> multiply. If we find this pattern, then instead of attempting to
> >>>> classify all IV uses as functions of i, i+1, i+2, ... we attempt to
> >>>> classify all uses of the multiplied IV that way.
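
To make the target of this pattern concrete, here is a minimal sketch of the loop from the example next to one possible rerolled form. The recording version of foo() and the names runUnrolled/runRerolled are hypothetical, introduced only so the two call sequences can be compared; this is not code from the patch.

```cpp
#include <vector>

// Hypothetical stand-in for foo(): records each argument it is called with.
static std::vector<int> calls;
static void foo(int x) { calls.push_back(x); }

// The unrolled-by-3 form from the example above.
std::vector<int> runUnrolled() {
  calls.clear();
  for (int i = 0; i < 500; ++i) {
    foo(3 * i);
    foo(3 * i + 1);
    foo(3 * i + 2);
  }
  return calls;
}

// A rerolled form: one call per iteration, three times as many iterations.
std::vector<int> runRerolled() {
  calls.clear();
  for (int i = 0; i < 1500; ++i)
    foo(i);
  return calls;
}
```

Both functions call foo() with the same arguments in the same order, which is the property a rerolling transformation has to preserve.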
> >>>> 
> >>>> 
> >>>> 
> >>>> 
> >>>> I think the more general SCEV-based way to do this would be to
> >>>> recursively walk the def-use chains starting at phis, looking past
> >>>> simple arithmetic until reaching an IV user (see
> >>>> IVUsers::AddUsersIfInteresting). Then you group the users by their
> >>>> IV operand's SCEV expression. If the SCEVs advance by the same
> >>>> constant, then you have your unrolled iterations and it doesn't
> >>>> matter how the induction variable was computed.
> >>>> LSRInstance::CollectChains does something similar.
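
One way to read the grouping idea above, sketched without any LLVM APIs: model each IV user's operand as a toy affine recurrence {Start,+,Step} and check whether the users look like k copies of a single iteration. The AffineRec struct and rerollFactor function are hypothetical names for illustration only; they are not LLVM's SCEV classes, and a real implementation would compare SCEV expressions rather than plain integers.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for an affine recurrence {Start,+,Step}. This is NOT LLVM's
// SCEV representation, just two integers for illustration.
struct AffineRec {
  int64_t Start;
  int64_t Step;
};

// Returns the reroll factor K if the users look like K unrolled copies of one
// iteration: all share one Step, their Starts are Base, Base+D, ...,
// Base+(K-1)*D, and Step == K*D. Returns 0 otherwise.
int64_t rerollFactor(std::vector<AffineRec> Users) {
  if (Users.size() < 2)
    return 0;
  std::sort(Users.begin(), Users.end(),
            [](const AffineRec &A, const AffineRec &B) {
              return A.Start < B.Start;
            });
  const int64_t Step = Users.front().Step;
  const int64_t D = Users[1].Start - Users[0].Start;
  if (D <= 0)
    return 0;
  for (std::size_t I = 0; I < Users.size(); ++I) {
    if (Users[I].Step != Step)
      return 0;
    if (Users[I].Start != Users[0].Start + static_cast<int64_t>(I) * D)
      return 0;
  }
  const int64_t K = static_cast<int64_t>(Users.size());
  return Step == K * D ? K : 0;
}
```

For the foo(3*i), foo(3*i+1), foo(3*i+2) example, the three operands would be modeled as {0,+,3}, {1,+,3} and {2,+,3}, giving a factor of 3 regardless of how the multiply was written in the source.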
> >>> 
> >>> Thanks! Collecting all IV users may be overkill here, but this is
> >>> something that I should play with.
> >>> 
> >>> While I have your attention (hopefully), why does SCEV not have a
> >>> signed division representation? I suspect that is why SCEV won't
> >>> give me a backedge-taken count for a loop like:
> >>> 
> >>> for (int i = 0; i < n; i += 5) {
> >>>   ...
> >>> }
> >> 
> >> 
> >> I don’t think division by a negative divisor lends itself to
> >> algebraic simplification. Hacker’s guide might say something about
> >> this.
> >> 
> >> The reason you don’t get a trip count is that 'i' might step beyond
> >> 'n' and overflow. If 'n' is a constant less than INT_MAX-4 then you
> >> get a trip count.
> >> 
> >> The NSW flags are supposed to handle this case. Did you lose them
> >> during loop unrolling?
> > 
> > I think that this is the key point. No, I tried this with optimized
> > code straight out of Clang and it did not work (I don't think that
> > it has anything to do with the loop body).
> 
> Right. How could I forget. This is the infamous case where we ignore
> NSW. The problem is that we don’t know how the backedge taken count
> will be used. We do know that if the loop exits via the current
> branch, it will be at iteration 'n'. However, we don’t know if the
> loop will continue iterating beyond that point.

But it seems, that being the case, we could still return 'n/5' for loops with only one exiting block? I thought that is what SE->hasLoopInvariantBackedgeTakenCount(L) was for. It would only say that there was a backedge-taken count if the loop structure was simple enough that there was one unambiguous answer. Is this just an implementation oversight, or are there additional complications?
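
For reference, the count in question and the overflow concern raised earlier in the thread can be sketched in plain C++, outside of SCEV entirely. The function names tripCount and exitValueOverflows are hypothetical, and the arithmetic is done in 64 bits so that the sketch itself cannot wrap; this only illustrates the arithmetic, not how ScalarEvolution computes anything.

```cpp
#include <climits>
#include <cstdint>

// Number of times the body of
//   for (int i = 0; i < n; i += 5) { ... }
// runs, computed in 64-bit arithmetic as if i never wrapped.
int64_t tripCount(int n) {
  if (n <= 0)
    return 0;
  return (static_cast<int64_t>(n) + 4) / 5;
}

// True if the increment that would end the loop pushes i past INT_MAX,
// which is the case where a wrapping i would never satisfy i >= n.
bool exitValueOverflows(int n) {
  return 5 * tripCount(n) > static_cast<int64_t>(INT_MAX);
}
```

The backedge-taken count is one less than tripCount(n) whenever the loop runs at least once. Since the exit value of i is at most n+4, any n no larger than INT_MAX-4 is safe, which matches the constant bound mentioned earlier in the thread.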

> 
> So you could get a minimum taken count for a particular loop back
> edge in this case if we adapt the SCEV API to communicate properly.

I recall discussing this before, and so I apologize, but can you elaborate on what 'communicate properly' will entail?

> 
> The advantage of doing this within LoopVectorizer is that we can
> gather the loop preconditions and emit preheader checks when
> profitable.

At least in the current setup, the LoopVectorizer is too late; SLP vectorization will inhibit rerolling, and that comes first. Also, we'd like the ability to reroll non-vectorizable loops. We might be able to reuse part of the SLP vectorizer here, but that's another matter.

Thanks again,
Hal

> 
> -Andy
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory



