[llvm-commits] Patch for review: Speeding up ScheduleDAG computations

Fri Feb 29 15:49:38 PST 2008

Let's try to solve one problem at a time.

1. Roman's depth / height patch calculation is obviously goodness.  
Please commit that first.
2. It's probably safe to eliminate the "special tie-breaker" from top  
down sorting function.
3. We will need to fix up the sorting functions so strict ordering is  
preserved. We need to do so without breaking any of the existing test  
case. "return left->NodeNum < right->NodeNum;" doesn't. :-( What does  
"return false" mean? Is it favoring nodes that are inserted earlier?
4. It might be wise to add some debugging code so we can tell where  
things start diverge.
5. It would be nice to be able to rebalance the heap. Roman and David  
can fight it out when we get to that point. :-)

Sounds like a plan? Let's resolve this soon or I will start hacking on  
it again. :-)

Evan

On Feb 29, 2008, at 1:44 PM, Roman Levenstein wrote:

> Hi David, Hi Evan,
>
> 2008/3/1, Evan Cheng <evan.cheng at apple.com>:
>>
>> On Feb 29, 2008, at 12:29 PM, David Greene wrote:
>>
>>> On Friday 29 February 2008 11:27, Roman Levenstein wrote:
>>>
>>>> Now,  a new problem I have descovered:
>>>> While improving my patch, I also looked at the LatencyPriorityQueue
>>>> implementation and  the RegReductionPriorityQueue implementation.
>>>> Both
>>>> are using priority queues and have an extremely inefficient  
>>>> remove()
>>>> method, that consumes 90% of the compilation time on very big BBs.
>>>> So,
>>>> I changed the data structure from the priority queue to the ordered
>>>> std::set. Obviously, the max element is basically the last  
>>>> element of
>>>> the set. This change produces a 3x speedup of compilation time on
>>>> huge
>>>> basic blocks. But there is a problem (or a bug!):
>>>
>>> The priority queue implementation is simply broken.  The heap
>>> properties
>>> are not properly maintained.  I proposed a patch to fix this some
>>> time ago
>>> but it is still in limbo.  It also cleans up the remove()
>>> infefficiency but
>>> trades it off for time to recompute the heap.  I've found it to be
>>> close to a
>>> wash, slower on some codes and faster on others.
>>>
>>> Evan, what do I need to do to get that patch into acceptable
>>> condition?
>>
>>
>> Please resubmit it again. I'll evaluate the compile time penalty  
>> today.
>
> David, I guess you mean this patch?
> http://www.nabble.com/RFC:-GLIBCXX_DEBUG-ScheduleDAG-Patch-td14382214.html
>
> Looking at it, I'd assume it is not extremely efficient since you
> still use the priority queue and rebuild it, when you remove elements.
>
> I'm pretty much sure that using std::set or any other efficient sorted
> set implementation (binary tree,etc) would perform better, if we need
> to delete elements from the middle of the heap and update priorities
> (which can be done by means of remove,change,insert). My preliminary
> tests done using std::set instead of priority queues for td-sched
> indicate a 3x performance boost.
>
>>>
>>>
>>>> the comparison operator
>>>> td_ls_rr_sort::operator()(const SUnit* left, const SUnit* right)  
>>>> (see
>>>> ScheduleDAGRRList.cpp)
>>>> seems to be implemented incorrectly (see the code below).
>>>
>>> Heh.  Yep, this looks familiar.
>
> Good that you confirm it.
>
>>>> Namely, I discovered by debugging two specific values L and R,  
>>>> where
>>>> L!=R, so that this operator returns FALSE (by return statement at  
>>>> the
>>>> very end of the method) for both pair (L,R) and (R,L). That means
>>>> that
>>>> L >= R and R >=L and L!=R at the same time, which cannot be.  This
>>>> incorrect comparison operator breaks all set operations based on  
>>>> the
>>>> ordering of elements, e.g. find() and insert(). Actually, I'm
>>>> wondering why it didn't break the priority queue operations before.
>>>
>>> Configure with --enable-expensive-checks and you'll probably get an
>>> assert.
>>
>>>> The same operator for  bu_ls_rr_sort has a very long disclaimer
>>>> related to a similar issue:
>>>> // There used to be a special tie breaker here that looked for
>>>> // two-address instructions and preferred the instruction with a
>>>> // def&use operand.  The special case triggered diagnostics when
>>>> // _GLIBCXX_DEBUG was enabled because it broke the strict weak
>>>> // ordering that priority_queue requires. It didn't help much  
>>>> anyway
>>>> // because AddPseudoTwoAddrDeps already covers many of the cases
>>>> // where it would have applied.  In addition, it's counter- 
>>>> intuitive
>>>> // that a tie breaker would be the first thing attempted.   
>>>> There's a
>>>> // "real" tie breaker below that is the operation of last resort.
>>>> // The fact that the "special tie breaker" would trigger when there
>>>> // wasn't otherwise a tie is what broke the strict weak ordering
>>>> // constraint.
>>>
>>> Yep, I believe Dan Gohman fixed this one.
>>>
>>>> So, there seems to be a problem. But I don't have a solution for
>>>> it. I need you help to understand how to fix it.
>>>
>>> It's possible the compare bug was not caught earlier because
>>> std::poriority_queue doesn't check it with _GLIBCXX_DEBUG turned
>>> on.  I don't know that for a fact, though.
>
> Could be.
>
>>> I would think that eliminating the special tie breaker might be a
>>> solution,
>>> similarly to what was done in bu_ls_rr_sort.
>>
>>
>> There is something I am not grasping. I would think the problem can
>> still occur even if the special tie breaker is removed.  Take any two
>> nodes A, B, suppose all of the conditions evaluates to false the
>> function will returns false regardless if it's comparing A <= B or B
>> <= A, right? I would think the right is just to change the default  
>> from
>> return false;
>> to
>> return left->NodeNum < right->NodeNum;
>>
>> This forces a strict ordering between two nodes, no?
>
> Agreed, this should introduce a strict ordering and fix the problem.
>
> So, we need your fix to update the priorities, but I don't think that
> your data-structures are efficient. So, I'd like to ask you to wait
> until Monday, until I test my patch including your priority update
> changes. This would avoid any commit conflicts. Based on the
> performance test, we'll decide which approach should be committet. Is
> it OK with you? I'd be glad to submit the patch immediately, but I
> forgot my development machine in the office ;-(
>
> Regards,
>  Roman
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits