[llvm-commits] Fwd: Patch for review: Speeding up ScheduleDAG computations

Tue Mar 4 23:08:39 PST 2008

Hi Dan,

2008/3/5, Dan Gohman <gohman at apple.com>:
>
>  On Mar 4, 2008, at 10:37 AM, Roman Levenstein wrote:
>  >>
>
> >> I thought those were approved? We understanding std::set is probably
>  >> the best choice of ADT for now. Thanks!
>  >
>  > OK. I'll commit them tomorrow.
>
>
> Great! Note that I did have one comment on the ScheduleDAGList.cpp
>  patch.

Sure, proposed changes from your comment will be in the patch.

>  > But are you sure about the second one for the ScheduleDAGRRList?
>  > I mean this: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20080303/059078.html
>
>
>
> I just took a look at this one. Here are my comments.
>
>   > +    // This may break the ordering in the priority queue.
>   >      PredSU->CycleBound = std::max(PredSU->CycleBound,
>   >                                    I->Dep->Cycle + PredSU->Latency);
>
>  This seems dangerous. After this is done, updateNode is called,
>  which attempts to remove the node from the set, but if the node has
>  changed
>  in a way that's significant to the comparison function, that remove
>  may not
>  work properly. It looks like the only way to be safe is to remove the
>  node
>  from the queue before making any modifications.

Good catch. I'll see how to fix this.

>   > +      for (unsigned i = 0, e = Nodes.size(); i != e; ++i) {
>   > +       Queue.insert(Nodes[i]);
>
>  > +      }
>
>  This could be simplified with
>
>              Queue.insert(Nodes.begin(), Nodes.end());
>
>   > +      SUnit *V = *Queue.rbegin();
>   > +      Queue.erase(--Queue.end(),Queue.end());
>
>  This could be simplified with
>
>              Queue.erase(prior(Queue.end());
>
>  or even
>
>              iterator i = prior(Queue.rbegin());
>
>              SUnit *V = *i;
>              Queue.erase(i);
>
>
> `prior' is defined in llvm/ADT/STLExtras.h.

Thanks! This makes the code cleaner.

>   > -  unsigned LBonus = (SumOfUnscheduledPredsOfSuccs(left) == 1) ?
>
> 2 : 0;
>   > -  unsigned RBonus = (SumOfUnscheduledPredsOfSuccs(right) == 1) ?
>  2 : 0;
>
>  > +  unsigned LBonus = (SumOfUnscheduledPredsOfSuccs(left,1) == 1) ?
>  2 : 0;
>   > +  unsigned RBonus = (SumOfUnscheduledPredsOfSuccs(right,1) == 1) ?
>  2 : 0;
>
>  It's not clear to me what this change is doing. Well, the original
>  code isn't
>  quite clear to me either :-}. Is this a heuristic change?

No, it does not change the semantic. I don't know what the original
code wanted to do. I just made it more efficient ;-)
SumOfUnscheduledPredsOfSuccs is called from the comparison operator
used by the priority queue, which is called rather often when any
operations on the queue are performed.  SumOfUnscheduledPredsOfSuccs
tries every time to compute the number of ALL unscheduled predecessors
of successors for a given node. And there can be hundreds/thousends of
them for big BBs (On some of my testcases this function was taking up
to 40-50% of the compilation time!). But then it compares this number
to 1, as you can see above. So I modified SumOfUnscheduledPredsOfSuccs
to compute only the number of predecessors until it reaches certain
limit/treshold, since it is enough for our goals. After this change,
SumOfUnscheduledPredsOfSuccs virtually does not consume any time at
all, profiler does not even show it, and the overall compilation time
is faster.

-Roman