[PATCH] D28593: Update loop branch_weight metadata after loop rotation.

Thu Jan 12 15:57:09 PST 2017

On Thu, Jan 12, 2017 at 3:37 PM, Michael Kuperstein <mkuper at google.com>
wrote:

>
>
> On Thu, Jan 12, 2017 at 2:55 PM, Xin Tong <trent.xin.tong at gmail.com>
> wrote:
>
>> On Thu, Jan 12, 2017 at 2:40 PM, Dehao Chen <danielcdh at gmail.com> wrote:
>> > Are you suggesting changing it to:
>> >
>> > rotate_update_probability(float orig_prob) {
>> >   if (orig_prob >= 0 && orig_prob < 0.5) {
>> >     branch_prob_inside = 0;
>> >     branch_prob_outside = orig_prob;
>> >   } else {
>> >     branch_prob_inside = 2 - 1/orig_prob;
>> >     branch_prob_outside = 1;
>> >   }
>> > }
>>
>> This would not work for the case which the orig_prob is bigger than
>> 0.5. branch_prob_outside tells me when the guard block is hit, it will
>> always go to the exit.
>>
>
I think "branch_prob_outside = 1" means that it will always *not* go to the
exit.

> >
>> > int estimate_trip_count(float latch_probability) {
>> >    if (latch_probability == 0)
>> >      return 0;
>> >    else
>> >      return 1/(1-latch_probability)
>> > }
>> >
>> > My gut feeling tells me this is a little dangerous. Micheal, what do you
>> > think?
>> >
>> > Dehao
>>
>> My general feeling is.
>>
>> 1. We do not have all the information to do a very precise
>> distribution of the weight, and we will unlikely to be able to get
>> such information in short term (we need value profiling at least I
>> think or some other more clever approach). we made a decision to use
>> average trip count for loop peeling, assuming the loop tripcounts are
>> not polarized. Loop peeling will probably not be as effective if it
>> is.  So we make the same assumption when we rotate the loops and
>> update the !prof metadata.
>>
>
> I agree.
>
>
>> 2. We also consider the possibility of this assumption falling out, in
>> the case of loop peel, we basically peeled for some gains when the
>> iteration has a high trip count and some loss  (increase code size)
>> when the iterations has low trip count. I feel this is acceptable.  In
>> case of metadata update in loop rotation, if loop does have a very
>> polarized distribution. i.e. cases where a few iterations takes all
>> the trip counts and rest have close to zero. We basically
>> underestimate the weight of the branch from guard block to exit block,
>> as there should be more iterations of the loop that exits before even
>> one iteration is executed. This will propagate into the CFG and result
>> in us underestimating the average tripcount of the loops when it
>> actually executes (i.e. we have fewer iterations responsible for all
>> these body weight). This is OK, we become less aggressive in peeling,
>> we left some performance opportunities on the table, but only if the
>> loop distribution is polarized. And we may do a slightly worse job in
>> block placement.
>>
>
> That's not exactly right. Underestimating can make us more aggressive, not
> less.
> Because we use a hard threshold for when to peel, underestimating the trip
> count
> can cause us to be more aggressive. I'm not sure this is a big problem,
> because
> if the threshold is chosen correctly, choosing to peel or not to peel when
> we're very
> close to the threshold should be a wash.
> Admittedly, that's a pretty big if. :-) But I don't think it's a huge
> problem.
>
> Dehao, why do you think this is dangerous? It's certainly somewhat ugly,
> but that's a separate issue.
>

That's just a gut feeling ;-) It's just that setting a very biased
probability for something we are unsure of really scares me, especially
when the outside condition could be very hot. (For the inner condition,
setting it to 0 when we know the trip count is small is less scary because
we are sure that it is less hot than the outside condition).

But that might not be a concern anyway if 100% and 51% does not make a
difference to the final codegen (code layout etc.).

Dehao

>
>> 3. In case the loop distribution is close to each other, the patch
>> handles it and distributes the weight in a reasonable way I feel.
>>
>> -Xin
>> >
>> > On Thu, Jan 12, 2017 at 2:33 PM, Xin Tong <trent.xin.tong at gmail.com>
>> wrote:
>> >>
>> >> On Thu, Jan 12, 2017 at 2:19 PM, Dehao Chen <danielcdh at gmail.com>
>> wrote:
>> >> >
>> >> >
>> >> > On Thu, Jan 12, 2017 at 2:14 PM, Xin Tong <trent.xin.tong at gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> On Thu, Jan 12, 2017 at 2:10 PM, Dehao Chen <danielcdh at gmail.com>
>> >> >> wrote:
>> >> >> >
>> >> >> >
>> >> >> > On Thu, Jan 12, 2017 at 2:02 PM, Xin Tong <
>> trent.xin.tong at gmail.com>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> On Thu, Jan 12, 2017 at 1:34 PM, Dehao Chen <danielcdh at gmail.com
>> >
>> >> >> >> wrote:
>> >> >> >> > If we assume that the trip count will always be the same, then
>> if
>> >> >> >> > back
>> >> >> >> > edge
>> >> >> >> > taken probability falls in the range of (0, 0.5), then it
>> already
>> >> >> >> > broke
>> >> >> >> > this
>> >> >> >> > assumption. So we do not have enough info to infer both branch
>> >> >> >> > probabilities. So probably we should simply set the inner
>> >> >> >> > probability
>> >> >> >> > as
>> >> >> >> > "0". And when we estimate loop trip count, we will always
>> return 0
>> >> >> >> > if
>> >> >> >> > the
>> >> >> >> > latch probability is 0. Something like:
>> >> >> >> >
>> >> >> >> > rotate_update_probability(float orig_prob) {
>> >> >> >> >   branch_prob_outside = orig_prob;
>> >> >> >> >   if (orig_prob >= 0 && orig_prob < 0.5) {
>> >> >> >> >     branch_prob_inside = 0;
>> >> >> >> >   } else {
>> >> >> >> >     branch_prob_inside = 2 - 1/orig_prob;
>> >> >> >> >   }
>> >> >> >> > }
>> >> >> >> >
>> >> >> >> > int estimate_trip_count(float latch_probability) {
>> >> >> >> >    if (latch_probability == 0)
>> >> >> >> >      return 0;
>> >> >> >> >    else
>> >> >> >> >      return 1/(1-latch_probability)
>> >> >> >> > }
>> >> >> >>
>> >> >> >> We assume the trip count will be close to each other, they do not
>> >> >> >> have
>> >> >> >> to be identical =). And that assumption has implications, I
>> >> >> >> addressed
>> >> >> >> the implications when we have polarized loop tripcounts in an
>> >> >> >> earlier
>> >> >> >> response. I feel we need to make assumptions and also reason
>> about
>> >> >> >> the
>> >> >> >> fallouts if that assumption turns out to be not true as Michael
>> >> >> >> said.
>> >> >> >
>> >> >> >
>> >> >> > Yes, the above code is trying to address one case where the
>> >> >> > assumption
>> >> >> > is
>> >> >> > untrue.
>> >> >> >
>> >> >> >>
>> >> >> >>
>> >> >> >> Back to your example, I am not sure i understand it fully, is
>> >> >> >> orig_prob the probability of branching outside before rotation ?
>> >> >> >
>> >> >> >
>> >> >> > Yes, orig_prob is the original probability of the branch before
>> loop
>> >> >> > rotation.
>> >> >>
>> >> >> If orig_prob == 0.5, the branch_prob_inside is 0.
>> (branch_prob_inside
>> >> >> is the probability that the loop body is branched to before rotation
>> >> >> ?).
>> >> >
>> >> >
>> >> > Sorry about the confusion. Let me use my original example:
>> >> >
>> >> > Before loop rotation
>> >> >
>> >> > L1:
>> >> > if (cond) { // "orig_prob"
>> >> >   stmt;
>> >> >   goto L1;
>> >> > }
>> >> >
>> >> > After loop rotation:
>> >> >
>> >> > if (cond) { // "branch_prob_outside"
>> >> > L1:
>> >> >   stmt;
>> >> >   if (cond) { // "branch_prob_inside" // and also "latch_probability"
>> >> >     goto L1;
>> >> >   }
>> >> > }
>> >> >
>> >> > Note that all probability are for when "cond" is true.
>> >> >
>> >> > Thanks,
>> >> > Dehao
>> >>
>> >> Thanks for the code. One thing I notice is that branch_prob_outside
>> >> probably should not be orig_prob. orig_prob before rotation is the (#
>> >> of times the exit is taken)/(# of times the header branch is hit
>> >> through forward and back edges). So if the loop has a cmp i32 %index,
>> >> 127 and recurrence of +1 (index = index + 1), i.e. the loop has a trip
>> >> count of 128, then branch_prob_outside = orig_prob = 1/128.  However,
>> >> branch_prob_outside should really be 0 as the loop will always
>> >> execute, i..e branch_prob_outside in the guard block will never be
>> >> taken.
>> >> >
>> >> >
>> >> >>
>> >> >> Thanks,
>> >> >> -Xin
>> >> >> >
>> >> >> >>
>> >> >> >>
>> >> >> >> Thanks,
>> >> >> >> -Xin
>> >> >> >>
>> >> >> >> >
>> >> >> >> >
>> >> >> >> > On Thu, Jan 12, 2017 at 12:59 PM, Michael Kuperstein via
>> >> >> >> > Phabricator
>> >> >> >> > <reviews at reviews.llvm.org> wrote:
>> >> >> >> >>
>> >> >> >> >> mkuper added inline comments.
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >> ================
>> >> >> >> >> Comment at: lib/Transforms/Scalar/LoopRotation.cpp:476
>> >> >> >> >> +  // data.
>> >> >> >> >> +  if (!updateLoopEstimatedBranchWeight(L, GBI, BI /*
>> OrigHeader
>> >> >> >> >> BR
>> >> >> >> >> */)) {
>> >> >> >> >> +    BI->setMetadata(LLVMContext::MD_prof, nullptr);
>> >> >> >> >> ----------------
>> >> >> >> >> Also, regardless of the rest of the discussion - I don't
>> think we
>> >> >> >> >> should
>> >> >> >> >> drop the metadata on the floor if we fail.
>> >> >> >> >>
>> >> >> >> >> I don't think "No data is better than imprecise data" is
>> right in
>> >> >> >> >> the
>> >> >> >> >> general case, but that's arguable. Specifically here, though,
>> >> >> >> >> we're
>> >> >> >> >> imprecise even if updateLoopEstimatedBranchWeight() succeeds,
>> >> >> >> >> because
>> >> >> >> >> of the
>> >> >> >> >> assumptions we make on the distribution.
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >> https://reviews.llvm.org/D28593
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >
>> >> >> >
>> >> >> >
>> >> >
>> >> >
>> >
>> >
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170112/093044ad/attachment-0001.html>