[PATCH] Loop Rerolling Pass

Thu Oct 17 09:30:19 PDT 2013

On Oct 17, 2013, at 6:25 AM, Hal Finkel <hfinkel at anl.gov> wrote:

> ----- Original Message -----
>> 
>> On Oct 16, 2013, at 4:20 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>> 
>>> ----- Original Message -----
>>>> 
>>>> On Oct 16, 2013, at 3:37 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>>> 
>>>>> ----- Original Message -----
>>>>>> 
>>>>>> On Oct 16, 2013, at 1:14 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>>>>> 
>>>>>>> ----- Original Message -----
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Oct 16, 2013, at 9:18 AM, Hal Finkel < hfinkel at anl.gov >
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> ----- Original Message -----
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On 15 October 2013 22:11, Hal Finkel < hfinkel at anl.gov >
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> I made use of SCEV everywhere that I could (I think). SCEV is
>>>>>>>> used to analyze the induction variables, and then at the end to
>>>>>>>> help with the rewriting. I don't think that I can use SCEV
>>>>>>>> everywhere, however. For one thing, I need to check for
>>>>>>>> equivalence of instructions, with certain substitutions, for
>>>>>>>> instructions (like function calls) that SCEV does not interpret.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Hi Hal,
>>>>>>>> 
>>>>>>>> 
>>>>>>>> This is probably more my lack of understanding of all that SCEV
>>>>>>>> does than anything else.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> My comment was about the fact that you seem to be investigating
>>>>>>>> specific cases (multiply, adding, increment size near line 240)
>>>>>>>> that SCEV could give you as an expression, possibly making them
>>>>>>>> slightly easier to work with. I'll let other SCEV/LV experts
>>>>>>>> chime in, because I basically don't know what I'm talking about
>>>>>>>> here. ;)
>>>>>>>> 
>>>>>>>> Okay, I see what you mean. The code in this block:
>>>>>>>> if (Inc == 1) {
>>>>>>>>   // This is a special case: here we're looking for all uses (except
>>>>>>>>   // for the increment) to be multiplied by a common factor. The
>>>>>>>>   // increment must be by one.
>>>>>>>>   if (I->first->getNumUses() != 2)
>>>>>>>>     continue;
>>>>>>>> 
>>>>>>>> This code does not use SCEV because, IMHO, there is no need. It
>>>>>>>> is looking for a very particular instruction pattern where the
>>>>>>>> induction variable has only two uses: one which increments it by
>>>>>>>> one (SCEV has already been used to determine that the increment
>>>>>>>> is 1), and the other is a multiply by a small constant. It is to
>>>>>>>> catch cases like this:
>>>>>>>> 
>>>>>>>> for (int i = 0; i < 500; ++i) {
>>>>>>>> foo(3*i);
>>>>>>>> foo(3*i+1);
>>>>>>>> foo(3*i+2);
>>>>>>>> }
>>>>>>>> 
>>>>>>>> And so, aside from the increment, all uses of the IV are via the
>>>>>>>> multiply. If we find this pattern, then instead of attempting to
>>>>>>>> classify all IV uses as functions of i, i+1, i+2, ... we attempt
>>>>>>>> to classify all uses of the multiplied IV that way.
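
As a rough illustration of the pattern described above, the following sketch compares the unrolled loop from the example with the rerolled form that such a pass would aim for. The helper record() is a hypothetical stand-in for the call to foo (it only logs its argument), and neither function is taken from the patch itself.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for foo(x): records the argument so that the two
// loop forms can be compared call by call.
static void record(std::vector<int> &Out, int X) { Out.push_back(X); }

// The unrolled form from the example: every use of i goes through 3*i.
std::vector<int> unrolledForm() {
  std::vector<int> Out;
  for (int i = 0; i < 500; ++i) {
    record(Out, 3 * i);
    record(Out, 3 * i + 1);
    record(Out, 3 * i + 2);
  }
  return Out;
}

// The rerolled form: one call per iteration and three times the trip count.
std::vector<int> rerolledForm() {
  std::vector<int> Out;
  for (int i = 0; i < 1500; ++i)
    record(Out, i);
  return Out;
}
```

Both functions produce the same sequence of calls in the same order, which is the property the rerolling transformation has to preserve.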
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> I think the more general SCEV-based way to do this would be to
>>>>>>>> recursively walk the def-use chains starting at phis, looking
>>>>>>>> past simple arithmetic until reaching an IV user (see
>>>>>>>> IVUsers::AddUsersIfInteresting). Then you group the users by
>>>>>>>> their IV operand's SCEV expression. If the SCEVs advance by the
>>>>>>>> same constant, then you have your unrolled iterations and it
>>>>>>>> doesn't matter how the induction variable was computed.
>>>>>>>> LSRInstance::CollectChains does something similar.
>>>>>>> 
>>>>>>> Thanks! Collecting all IV users may be overkill here, but this is
>>>>>>> something that I should play with.
>>>>>>> 
>>>>>>> While I have your attention (hopefully), why does SCEV not have a
>>>>>>> signed division representation? I suspect that is why SCEV won't
>>>>>>> give me a backedge-taken count for a loop like:
>>>>>>> 
>>>>>>> for (int i = 0; i < n; i += 5) {
>>>>>>> ...
>>>>>>> }
>>>>>> 
>>>>>> 
>>>>>> I don’t think division by a negative divisor lends itself to
>>>>>> algebraic simplification. Hacker’s guide might say something about
>>>>>> this.
>>>>>> 
>>>>>> The reason you don’t get a trip count is that ‘i’ might step beyond
>>>>>> ‘n’ and overflow. If ‘n’ is a constant less than INT_MAX-4 then you
>>>>>> get a trip count.
>>>>>> 
>>>>>> The NSW flags are supposed to handle this case. Did you lose them
>>>>>> during loop unrolling?
>>>>> 
>>>>> I think that this is the key point. No, I tried this with optimized
>>>>> code straight out of Clang and it did not work (I don't think that
>>>>> it has anything to do with the loop body).
>>>> 
>>>> Right. How could I forget. This is the infamous case where we ignore
>>>> NSW. The problem is that we don’t know how the backedge taken count
>>>> will be used. We do know that if the loop exits via the current
>>>> branch, it will be at iteration ‘n’. However, we don’t know if the
>>>> loop will continue iterating beyond that point.
>>> 
>>> But it seems, that being the case, we could still return 'n/5' for
>>> loops with only one exiting block? I thought that is what
>>> SE->hasLoopInvariantBackedgeTakenCount(L) was for. It would only
>>> say that there was a backedge-taken count if the loop structure
>>> was simple enough that there was one unambiguous answer. Is this
>>> just an implementation oversight, or are there additional
>>> complications?
>>> 
>>>> 
>>>> So you could get a minimum taken count for a particular loop back
>>>> edge in this case if we adapt the SCEV API to communicate
>>>> properly.
>>> 
>>> I recall discussing this before, and so I apologize, but can you
>>> elaborate on what 'communicate properly' will entail?
>> 
>> I’m open to anything. For each branch exit we could distinguish
>> between a min vs. exact backedge taken count. We just have to be
>> careful how we present it to the public API and err on the side of
>> caution. If a user simply asks for the loop trip count, I don’t
>> think it’s correct to return ’n’, since subsequent iterations may
>> run before hitting undefined behavior. There have been bugs related
>> to this in the past.
>> 
>> If the client either asks for a minimum trip count, or the iteration
>> count at which we may observe a loop exit, then we can safely
>> provide an answer. In your case, you’re asking for an equivalent
>> loop test, so is it safe? I think it only works for you because you
>> know the loop contains no calls, so the program has no way to
>> terminate before hitting undefined behavior.
>> 
>> Maybe the high level SCEV interface should take a loop-may-terminate
>> parameter. The client can set this to false if it goes to the
>> trouble of proving it.
> 
> This sounds like a good idea. However, do all of these concerns not equally apply for a constant 'n'?

If the loop is testing less-than constant ’n’, I think we already handle it (knowing n < INT_MAX-stride). I’m not sure what we do for equals ’n’.
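
To make the less-than case concrete, here is a minimal sketch of the arithmetic, assuming a loop of the form for (int i = 0; i < N; i += Stride). The function tripCount is a hypothetical helper and not part of the SCEV API; it uses the conservative bound N < INT_MAX - Stride mentioned above and declines to answer otherwise.

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper (not an LLVM API). Computes how many times the body of
//   for (int i = 0; i < N; i += Stride) { ... }
// runs, but only when the increment provably cannot overflow.
bool tripCount(int N, int Stride, long long &Count) {
  if (Stride <= 0)
    return false; // only positive strides are modeled in this sketch
  if (N <= 0) {
    Count = 0; // the very first test 0 < N already fails
    return true;
  }
  // Bound from the discussion: N < INT_MAX - Stride. Under it, i never
  // exceeds N + Stride - 1, so i += Stride cannot wrap.
  if (N >= INT_MAX - Stride)
    return false; // the increment might overflow; give no answer
  // ceil(N / Stride), computed in a wider type.
  Count = (static_cast<long long>(N) + Stride - 1) / Stride;
  return true;
}
```

For N = 11 and Stride = 5 the body runs for i = 0, 5, 10, so the count is 3 and the backedge is taken one time fewer. When N is not known to satisfy the bound, the sketch returns false rather than guessing.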

-Andy