[PATCH] D28593: Update loop branch_weight metadata after loop rotation.

Thu Jan 12 16:18:13 PST 2017

Yes, that's what I meant.

On Thu, Jan 12, 2017 at 4:14 PM, Xin Tong <trent.xin.tong at gmail.com> wrote:

> On Thu, Jan 12, 2017 at 4:09 PM, Xin Tong <trent.xin.tong at gmail.com>
> wrote:
> > On Thu, Jan 12, 2017 at 3:37 PM, Michael Kuperstein <mkuper at google.com>
> wrote:
> >>
> >>
> >> On Thu, Jan 12, 2017 at 2:55 PM, Xin Tong <trent.xin.tong at gmail.com>
> wrote:
> >>>
> >>> On Thu, Jan 12, 2017 at 2:40 PM, Dehao Chen <danielcdh at gmail.com>
> wrote:
> >>> > Are you suggesting changing it to:
> >>> >
> >>> > rotate_update_probability(float orig_prob) {
> >>> >   if (orig_prob >= 0 && orig_prob < 0.5) {
> >>> >     branch_prob_inside = 0;
> >>> >     branch_prob_outside = orig_prob;
> >>> >   } else {
> >>> >     branch_prob_inside = 2 - 1/orig_prob;
> >>> >     branch_prob_outside = 1;
> >>> >   }
> >>> > }
> >>>
> >>> This would not work for the case which the orig_prob is bigger than
> >>> 0.5. branch_prob_outside tells me when the guard block is hit, it will
> >>> always go to the exit.
> >>> >
> >>> > int estimate_trip_count(float latch_probability) {
> >>> >    if (latch_probability == 0)
> >>> >      return 0;
> >>> >    else
> >>> >      return 1/(1-latch_probability)
> >>> > }
> >>> >
> >>> > My gut feeling tells me this is a little dangerous. Micheal, what do
> you
> >>> > think?
> >>> >
> >>> > Dehao
> >>>
> >>> My general feeling is.
> >>>
> >>> 1. We do not have all the information to do a very precise
> >>> distribution of the weight, and we will unlikely to be able to get
> >>> such information in short term (we need value profiling at least I
> >>> think or some other more clever approach). we made a decision to use
> >>> average trip count for loop peeling, assuming the loop tripcounts are
> >>> not polarized. Loop peeling will probably not be as effective if it
> >>> is.  So we make the same assumption when we rotate the loops and
> >>> update the !prof metadata.
> >>
> >>
> >> I agree.
> >>
> >>>
> >>> 2. We also consider the possibility of this assumption falling out, in
> >>> the case of loop peel, we basically peeled for some gains when the
> >>> iteration has a high trip count and some loss  (increase code size)
> >>> when the iterations has low trip count. I feel this is acceptable.  In
> >>> case of metadata update in loop rotation, if loop does have a very
> >>> polarized distribution. i.e. cases where a few iterations takes all
> >>> the trip counts and rest have close to zero. We basically
> >>> underestimate the weight of the branch from guard block to exit block,
> >>> as there should be more iterations of the loop that exits before even
> >>> one iteration is executed. This will propagate into the CFG and result
> >>> in us underestimating the average tripcount of the loops when it
> >>> actually executes (i.e. we have fewer iterations responsible for all
> >>> these body weight). This is OK, we become less aggressive in peeling,
> >>> we left some performance opportunities on the table, but only if the
> >>> loop distribution is polarized. And we may do a slightly worse job in
> >>> block placement.
> >>
> >>
> >> That's not exactly right. Underestimating can make us more aggressive,
> not
> >> less.
> >> Because we use a hard threshold for when to peel, underestimating the
> trip
> >> count
> >> can cause us to be more aggressive. I'm not sure this is a big problem,
> >> because
> >> if the threshold is chosen correctly, choosing to peel or not to peel
> when
> >> we're very
> >> close to the threshold should be a wash.
> >> Admittedly, that's a pretty big if. :-) But I don't think it's a huge
> >> problem.
> >
> > If the actual tripcount of the loop is 10 and we estimated to be 5, we
> > will peel it 5 iterations.
> > Its only when the
> > https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/Utils/
> LoopUnrollPeel.cpp#L95
> > is false,  we do not peel, then if the underestimation breaks this
> > condition, the real number (bigger) will sure not cause the loop to be
> > peeled.
> >
> > i.e. we will never peel more than this threshold. so I feel we are
> > leaving opportunities on the table, instead of being too
> > aggressive.
> >
> > -Xin
>
> Ok, there is also the possiblity that we underestimate and peel a loop
> which we would not peel given the right tripcount.  I think thats what
> you meant.
> -Xin
> >>
> >> Dehao, why do you think this is dangerous? It's certainly somewhat ugly,
> >> but that's a separate issue.
> >>
> >>>
> >>> 3. In case the loop distribution is close to each other, the patch
> >>> handles it and distributes the weight in a reasonable way I feel.
> >>>
> >>> -Xin
> >>> >
> >>> > On Thu, Jan 12, 2017 at 2:33 PM, Xin Tong <trent.xin.tong at gmail.com>
> >>> > wrote:
> >>> >>
> >>> >> On Thu, Jan 12, 2017 at 2:19 PM, Dehao Chen <danielcdh at gmail.com>
> >>> >> wrote:
> >>> >> >
> >>> >> >
> >>> >> > On Thu, Jan 12, 2017 at 2:14 PM, Xin Tong <
> trent.xin.tong at gmail.com>
> >>> >> > wrote:
> >>> >> >>
> >>> >> >> On Thu, Jan 12, 2017 at 2:10 PM, Dehao Chen <danielcdh at gmail.com
> >
> >>> >> >> wrote:
> >>> >> >> >
> >>> >> >> >
> >>> >> >> > On Thu, Jan 12, 2017 at 2:02 PM, Xin Tong
> >>> >> >> > <trent.xin.tong at gmail.com>
> >>> >> >> > wrote:
> >>> >> >> >>
> >>> >> >> >> On Thu, Jan 12, 2017 at 1:34 PM, Dehao Chen <
> danielcdh at gmail.com>
> >>> >> >> >> wrote:
> >>> >> >> >> > If we assume that the trip count will always be the same,
> then
> >>> >> >> >> > if
> >>> >> >> >> > back
> >>> >> >> >> > edge
> >>> >> >> >> > taken probability falls in the range of (0, 0.5), then it
> >>> >> >> >> > already
> >>> >> >> >> > broke
> >>> >> >> >> > this
> >>> >> >> >> > assumption. So we do not have enough info to infer both
> branch
> >>> >> >> >> > probabilities. So probably we should simply set the inner
> >>> >> >> >> > probability
> >>> >> >> >> > as
> >>> >> >> >> > "0". And when we estimate loop trip count, we will always
> >>> >> >> >> > return 0
> >>> >> >> >> > if
> >>> >> >> >> > the
> >>> >> >> >> > latch probability is 0. Something like:
> >>> >> >> >> >
> >>> >> >> >> > rotate_update_probability(float orig_prob) {
> >>> >> >> >> >   branch_prob_outside = orig_prob;
> >>> >> >> >> >   if (orig_prob >= 0 && orig_prob < 0.5) {
> >>> >> >> >> >     branch_prob_inside = 0;
> >>> >> >> >> >   } else {
> >>> >> >> >> >     branch_prob_inside = 2 - 1/orig_prob;
> >>> >> >> >> >   }
> >>> >> >> >> > }
> >>> >> >> >> >
> >>> >> >> >> > int estimate_trip_count(float latch_probability) {
> >>> >> >> >> >    if (latch_probability == 0)
> >>> >> >> >> >      return 0;
> >>> >> >> >> >    else
> >>> >> >> >> >      return 1/(1-latch_probability)
> >>> >> >> >> > }
> >>> >> >> >>
> >>> >> >> >> We assume the trip count will be close to each other, they do
> not
> >>> >> >> >> have
> >>> >> >> >> to be identical =). And that assumption has implications, I
> >>> >> >> >> addressed
> >>> >> >> >> the implications when we have polarized loop tripcounts in an
> >>> >> >> >> earlier
> >>> >> >> >> response. I feel we need to make assumptions and also reason
> >>> >> >> >> about
> >>> >> >> >> the
> >>> >> >> >> fallouts if that assumption turns out to be not true as
> Michael
> >>> >> >> >> said.
> >>> >> >> >
> >>> >> >> >
> >>> >> >> > Yes, the above code is trying to address one case where the
> >>> >> >> > assumption
> >>> >> >> > is
> >>> >> >> > untrue.
> >>> >> >> >
> >>> >> >> >>
> >>> >> >> >>
> >>> >> >> >> Back to your example, I am not sure i understand it fully, is
> >>> >> >> >> orig_prob the probability of branching outside before
> rotation ?
> >>> >> >> >
> >>> >> >> >
> >>> >> >> > Yes, orig_prob is the original probability of the branch before
> >>> >> >> > loop
> >>> >> >> > rotation.
> >>> >> >>
> >>> >> >> If orig_prob == 0.5, the branch_prob_inside is 0.
> >>> >> >> (branch_prob_inside
> >>> >> >> is the probability that the loop body is branched to before
> rotation
> >>> >> >> ?).
> >>> >> >
> >>> >> >
> >>> >> > Sorry about the confusion. Let me use my original example:
> >>> >> >
> >>> >> > Before loop rotation
> >>> >> >
> >>> >> > L1:
> >>> >> > if (cond) { // "orig_prob"
> >>> >> >   stmt;
> >>> >> >   goto L1;
> >>> >> > }
> >>> >> >
> >>> >> > After loop rotation:
> >>> >> >
> >>> >> > if (cond) { // "branch_prob_outside"
> >>> >> > L1:
> >>> >> >   stmt;
> >>> >> >   if (cond) { // "branch_prob_inside" // and also
> "latch_probability"
> >>> >> >     goto L1;
> >>> >> >   }
> >>> >> > }
> >>> >> >
> >>> >> > Note that all probability are for when "cond" is true.
> >>> >> >
> >>> >> > Thanks,
> >>> >> > Dehao
> >>> >>
> >>> >> Thanks for the code. One thing I notice is that branch_prob_outside
> >>> >> probably should not be orig_prob. orig_prob before rotation is the
> (#
> >>> >> of times the exit is taken)/(# of times the header branch is hit
> >>> >> through forward and back edges). So if the loop has a cmp i32
> %index,
> >>> >> 127 and recurrence of +1 (index = index + 1), i.e. the loop has a
> trip
> >>> >> count of 128, then branch_prob_outside = orig_prob = 1/128.
> However,
> >>> >> branch_prob_outside should really be 0 as the loop will always
> >>> >> execute, i..e branch_prob_outside in the guard block will never be
> >>> >> taken.
> >>> >> >
> >>> >> >
> >>> >> >>
> >>> >> >> Thanks,
> >>> >> >> -Xin
> >>> >> >> >
> >>> >> >> >>
> >>> >> >> >>
> >>> >> >> >> Thanks,
> >>> >> >> >> -Xin
> >>> >> >> >>
> >>> >> >> >> >
> >>> >> >> >> >
> >>> >> >> >> > On Thu, Jan 12, 2017 at 12:59 PM, Michael Kuperstein via
> >>> >> >> >> > Phabricator
> >>> >> >> >> > <reviews at reviews.llvm.org> wrote:
> >>> >> >> >> >>
> >>> >> >> >> >> mkuper added inline comments.
> >>> >> >> >> >>
> >>> >> >> >> >>
> >>> >> >> >> >> ================
> >>> >> >> >> >> Comment at: lib/Transforms/Scalar/LoopRotation.cpp:476
> >>> >> >> >> >> +  // data.
> >>> >> >> >> >> +  if (!updateLoopEstimatedBranchWeight(L, GBI, BI /*
> >>> >> >> >> >> OrigHeader
> >>> >> >> >> >> BR
> >>> >> >> >> >> */)) {
> >>> >> >> >> >> +    BI->setMetadata(LLVMContext::MD_prof, nullptr);
> >>> >> >> >> >> ----------------
> >>> >> >> >> >> Also, regardless of the rest of the discussion - I don't
> think
> >>> >> >> >> >> we
> >>> >> >> >> >> should
> >>> >> >> >> >> drop the metadata on the floor if we fail.
> >>> >> >> >> >>
> >>> >> >> >> >> I don't think "No data is better than imprecise data" is
> right
> >>> >> >> >> >> in
> >>> >> >> >> >> the
> >>> >> >> >> >> general case, but that's arguable. Specifically here,
> though,
> >>> >> >> >> >> we're
> >>> >> >> >> >> imprecise even if updateLoopEstimatedBranchWeight()
> succeeds,
> >>> >> >> >> >> because
> >>> >> >> >> >> of the
> >>> >> >> >> >> assumptions we make on the distribution.
> >>> >> >> >> >>
> >>> >> >> >> >>
> >>> >> >> >> >> https://reviews.llvm.org/D28593
> >>> >> >> >> >>
> >>> >> >> >> >>
> >>> >> >> >> >>
> >>> >> >> >> >
> >>> >> >> >
> >>> >> >> >
> >>> >> >
> >>> >> >
> >>> >
> >>> >
> >>
> >>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170112/e860bb67/attachment.html>