[PATCH] D28593: Update loop branch_weight metadata after loop rotation.

Xin Tong via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 12 14:55:43 PST 2017


On Thu, Jan 12, 2017 at 2:40 PM, Dehao Chen <danielcdh at gmail.com> wrote:
> Are you suggesting changing it to:
>
> rotate_update_probability(float orig_prob) {
>   if (orig_prob >= 0 && orig_prob < 0.5) {
>     branch_prob_inside = 0;
>     branch_prob_outside = orig_prob;
>   } else {
>     branch_prob_inside = 2 - 1/orig_prob;
>     branch_prob_outside = 1;
>   }
> }

This would not work for the case which the orig_prob is bigger than
0.5. branch_prob_outside tells me when the guard block is hit, it will
always go to the exit.
>
> int estimate_trip_count(float latch_probability) {
>    if (latch_probability == 0)
>      return 0;
>    else
>      return 1/(1-latch_probability)
> }
>
> My gut feeling tells me this is a little dangerous. Micheal, what do you
> think?
>
> Dehao

My general feeling is.

1. We do not have all the information to do a very precise
distribution of the weight, and we will unlikely to be able to get
such information in short term (we need value profiling at least I
think or some other more clever approach). we made a decision to use
average trip count for loop peeling, assuming the loop tripcounts are
not polarized. Loop peeling will probably not be as effective if it
is.  So we make the same assumption when we rotate the loops and
update the !prof metadata.
2. We also consider the possibility of this assumption falling out, in
the case of loop peel, we basically peeled for some gains when the
iteration has a high trip count and some loss  (increase code size)
when the iterations has low trip count. I feel this is acceptable.  In
case of metadata update in loop rotation, if loop does have a very
polarized distribution. i.e. cases where a few iterations takes all
the trip counts and rest have close to zero. We basically
underestimate the weight of the branch from guard block to exit block,
as there should be more iterations of the loop that exits before even
one iteration is executed. This will propagate into the CFG and result
in us underestimating the average tripcount of the loops when it
actually executes (i.e. we have fewer iterations responsible for all
these body weight). This is OK, we become less aggressive in peeling,
we left some performance opportunities on the table, but only if the
loop distribution is polarized. And we may do a slightly worse job in
block placement.
3. In case the loop distribution is close to each other, the patch
handles it and distributes the weight in a reasonable way I feel.

-Xin
>
> On Thu, Jan 12, 2017 at 2:33 PM, Xin Tong <trent.xin.tong at gmail.com> wrote:
>>
>> On Thu, Jan 12, 2017 at 2:19 PM, Dehao Chen <danielcdh at gmail.com> wrote:
>> >
>> >
>> > On Thu, Jan 12, 2017 at 2:14 PM, Xin Tong <trent.xin.tong at gmail.com>
>> > wrote:
>> >>
>> >> On Thu, Jan 12, 2017 at 2:10 PM, Dehao Chen <danielcdh at gmail.com>
>> >> wrote:
>> >> >
>> >> >
>> >> > On Thu, Jan 12, 2017 at 2:02 PM, Xin Tong <trent.xin.tong at gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> On Thu, Jan 12, 2017 at 1:34 PM, Dehao Chen <danielcdh at gmail.com>
>> >> >> wrote:
>> >> >> > If we assume that the trip count will always be the same, then if
>> >> >> > back
>> >> >> > edge
>> >> >> > taken probability falls in the range of (0, 0.5), then it already
>> >> >> > broke
>> >> >> > this
>> >> >> > assumption. So we do not have enough info to infer both branch
>> >> >> > probabilities. So probably we should simply set the inner
>> >> >> > probability
>> >> >> > as
>> >> >> > "0". And when we estimate loop trip count, we will always return 0
>> >> >> > if
>> >> >> > the
>> >> >> > latch probability is 0. Something like:
>> >> >> >
>> >> >> > rotate_update_probability(float orig_prob) {
>> >> >> >   branch_prob_outside = orig_prob;
>> >> >> >   if (orig_prob >= 0 && orig_prob < 0.5) {
>> >> >> >     branch_prob_inside = 0;
>> >> >> >   } else {
>> >> >> >     branch_prob_inside = 2 - 1/orig_prob;
>> >> >> >   }
>> >> >> > }
>> >> >> >
>> >> >> > int estimate_trip_count(float latch_probability) {
>> >> >> >    if (latch_probability == 0)
>> >> >> >      return 0;
>> >> >> >    else
>> >> >> >      return 1/(1-latch_probability)
>> >> >> > }
>> >> >>
>> >> >> We assume the trip count will be close to each other, they do not
>> >> >> have
>> >> >> to be identical =). And that assumption has implications, I
>> >> >> addressed
>> >> >> the implications when we have polarized loop tripcounts in an
>> >> >> earlier
>> >> >> response. I feel we need to make assumptions and also reason about
>> >> >> the
>> >> >> fallouts if that assumption turns out to be not true as Michael
>> >> >> said.
>> >> >
>> >> >
>> >> > Yes, the above code is trying to address one case where the
>> >> > assumption
>> >> > is
>> >> > untrue.
>> >> >
>> >> >>
>> >> >>
>> >> >> Back to your example, I am not sure i understand it fully, is
>> >> >> orig_prob the probability of branching outside before rotation ?
>> >> >
>> >> >
>> >> > Yes, orig_prob is the original probability of the branch before loop
>> >> > rotation.
>> >>
>> >> If orig_prob == 0.5, the branch_prob_inside is 0. (branch_prob_inside
>> >> is the probability that the loop body is branched to before rotation
>> >> ?).
>> >
>> >
>> > Sorry about the confusion. Let me use my original example:
>> >
>> > Before loop rotation
>> >
>> > L1:
>> > if (cond) { // "orig_prob"
>> >   stmt;
>> >   goto L1;
>> > }
>> >
>> > After loop rotation:
>> >
>> > if (cond) { // "branch_prob_outside"
>> > L1:
>> >   stmt;
>> >   if (cond) { // "branch_prob_inside" // and also "latch_probability"
>> >     goto L1;
>> >   }
>> > }
>> >
>> > Note that all probability are for when "cond" is true.
>> >
>> > Thanks,
>> > Dehao
>>
>> Thanks for the code. One thing I notice is that branch_prob_outside
>> probably should not be orig_prob. orig_prob before rotation is the (#
>> of times the exit is taken)/(# of times the header branch is hit
>> through forward and back edges). So if the loop has a cmp i32 %index,
>> 127 and recurrence of +1 (index = index + 1), i.e. the loop has a trip
>> count of 128, then branch_prob_outside = orig_prob = 1/128.  However,
>> branch_prob_outside should really be 0 as the loop will always
>> execute, i..e branch_prob_outside in the guard block will never be
>> taken.
>> >
>> >
>> >>
>> >> Thanks,
>> >> -Xin
>> >> >
>> >> >>
>> >> >>
>> >> >> Thanks,
>> >> >> -Xin
>> >> >>
>> >> >> >
>> >> >> >
>> >> >> > On Thu, Jan 12, 2017 at 12:59 PM, Michael Kuperstein via
>> >> >> > Phabricator
>> >> >> > <reviews at reviews.llvm.org> wrote:
>> >> >> >>
>> >> >> >> mkuper added inline comments.
>> >> >> >>
>> >> >> >>
>> >> >> >> ================
>> >> >> >> Comment at: lib/Transforms/Scalar/LoopRotation.cpp:476
>> >> >> >> +  // data.
>> >> >> >> +  if (!updateLoopEstimatedBranchWeight(L, GBI, BI /* OrigHeader
>> >> >> >> BR
>> >> >> >> */)) {
>> >> >> >> +    BI->setMetadata(LLVMContext::MD_prof, nullptr);
>> >> >> >> ----------------
>> >> >> >> Also, regardless of the rest of the discussion - I don't think we
>> >> >> >> should
>> >> >> >> drop the metadata on the floor if we fail.
>> >> >> >>
>> >> >> >> I don't think "No data is better than imprecise data" is right in
>> >> >> >> the
>> >> >> >> general case, but that's arguable. Specifically here, though,
>> >> >> >> we're
>> >> >> >> imprecise even if updateLoopEstimatedBranchWeight() succeeds,
>> >> >> >> because
>> >> >> >> of the
>> >> >> >> assumptions we make on the distribution.
>> >> >> >>
>> >> >> >>
>> >> >> >> https://reviews.llvm.org/D28593
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >
>> >> >
>> >> >
>> >
>> >
>
>


More information about the llvm-commits mailing list